Create Hive UDF with IntelliJ

Posted by Bourne's Blog - A Full-stack & Web3 Developer on April 21, 2022

1. Create a maven project

  • Don’t choose any template from the first dialog.
  • Enter project name, such as “yukun-udf”;
  • Add your identifier to group id like “org.yukun”.

2. Add dependency

  • Create “lib” folder under project root folder, copy hive-exec-3.1.2.jar from $HIVE_HOME/lib/ to here.
  • Add lib to librarie(Open File > Project Structure > Project Settings > Libraries).
  • Add code in pom.xml:
    1
    2
    3
    4
    5
    6
    7
    
      <dependencies>
          <dependency>
              <groupId>org.apache.hive</groupId>
              <artifactId>hive-exec</artifactId>
              <version>1.2.1</version>
          </dependency>
      </dependencies>
    

3. Complete UDF

Create a Java class on src/main/java folder, enter name: org.yukun.YouClassName, here I create a class “org.yukun.AddFunc”;

Open this AddFunc class, import UDF class, and add your business logic in function “evaluate” :

1
2
3
4
5
6
7
8
9
10
11
12
package org.yukun;
import org.apache.hadoop.hive.ql.exec.UDF;

public class AddFunc extends UDF{
    public int evaluate (int a, int b) {
        return a+b;
    }

    public int evaluate (int a, int b, int c) {
        return a+b+c;
    }
}

4. Build jar

  • Add Artifacts (Open File > Project Structure > Project Settings > Artifacts > Create -> Jar -> From modules with dependencies).
  • Build > Build Artifacts > Build, this action will create a jar under out/artifacts/yukun_udf_jar

5. Test

  • Copy jar to hadoop001
  • Add jar resource and register temporary UDF;
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    
    $ hive
    hive> add jar /opt/udf/yukun-udf.jar;
    Added [/opt/udf/yukun-udf.jar] to class path
    Added resources: [/opt/udf/yukun-udf.jar]
    hive> create temporary function addfc as 'org.yukun.AddFunc';
    OK
    hive> select addfc(1,2);
    OK
    _c0
    3
    Time taken: 2.017 seconds, Fetched: 1 row(s)
    hive> select addfc(1,2,3);
    OK
    _c0
    6
    

6. Create permanent UDF

  • put jar into hdfs
    1
    2
    3
    4
    5
    
    $ hdfs dfs -mkdir -p /user/hive/func/lib
    $ hdfs dfs -put /opt/udf/yukun-udf.jar /user/hive/func/lib
    $ hdfs dfs -ls /user/hive/func/lib
    Found 1 items
    -rw-r--r--   3 root supergroup        662 2022-04-21 17:16 /user/hive/func/lib/yukun-udf.jar
    
  • register UDF
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    
    $ hive
    hive> create function addfc as 'org.yukun.AddFunc' using jar 'hdfs://hadoop001:9000/user/hive/func/lib/yukun-udf.jar';
    Added [/opt/module/apache-hive-3.1.2-bin/tmp/e243ba39-fd2e-4acd-85b7-5915f604026c_resources/yukun-udf.jar] to class path
    Added resources: [hdfs://hadoop001:9000/user/hive/func/lib/yukun-udf.jar]
    OK
    hive> select addfc(3,4);
    OK
    _c0
    7
    Time taken: 1.994 seconds, Fetched: 1 row(s)
    hive> select addfc(3,4,5);
    OK
    _c0
    12
    Time taken: 0.083 seconds, Fetched: 1 row(s)
    

    Great!