Bourne's Blog - A Full-stack & Web3 Developer

「big data era」

HBase Notes

HBase HMaster: What is the function of HMaster? HMaster is responsible for region assignment as well as DDL operations (creating and deleting tables). There are two main responsibilities of HMaster: Coo...

Flink Tasks: Top Hot Products/Page View

1. Top Hot Products: Continuously compute the top hot products from the log. The log has the following columns: userId,productId,categoryId,behavior,timeStamp,sessionId, and the sample data looks like: 1021...
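The ranking described in this excerpt can be illustrated with a minimal plain-Python sketch. This is not the Flink job from the post. The column order comes from the excerpt, but the `pv` behavior value, the timestamp unit (seconds), and the one-hour tumbling window are assumptions made only for illustration.

```python
from collections import Counter, defaultdict

# Column order taken from the log description; everything else is assumed.
COLUMNS = ["userId", "productId", "categoryId", "behavior", "timeStamp", "sessionId"]


def top_hot_products(lines, behavior="pv", window_seconds=3600, top_n=3):
    """Return {window_start: [(productId, count), ...]} per tumbling window.

    `behavior`, `window_seconds` and `top_n` are hypothetical parameters;
    timestamps are assumed to be integer seconds.
    """
    counts = defaultdict(Counter)
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) != len(COLUMNS):
            continue  # skip malformed rows
        row = dict(zip(COLUMNS, fields))
        if row["behavior"] != behavior:
            continue  # only count the behavior of interest
        ts = int(row["timeStamp"])
        window_start = ts - ts % window_seconds  # align to the window boundary
        counts[window_start][row["productId"]] += 1
    # Keep the top_n most frequent products in each window, ordered by window start.
    return {w: c.most_common(top_n) for w, c in sorted(counts.items())}
```

A streaming engine such as Flink would emit these rankings incrementally as windows close, whereas this sketch processes a finished batch of lines.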

Flink Realtime User Analyses

1. Requirement: Analyse the distribution of users by province, age group and sex, based on real-time user registration data from a website. A sample of the desensitised data looks like: {"A...

Spark User Analyses

1. Requirement: Analyse the distribution of users by province, age group and sex, based on user registration data from a website. A sample of the desensitised data looks like: {"AGE":48,"B...
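The aggregation behind this requirement can be sketched in plain Python (not Spark) as follows. Only the `AGE` field is visible in the truncated sample. The `PROVINCE` and `SEX` keys, the ten-year age buckets, and the one-JSON-object-per-line input are hypothetical choices made for illustration.

```python
import json
from collections import Counter


def age_group(age, width=10):
    """Bucket an age into a label such as '40-49' (bucket width is an assumption)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def user_distribution(json_lines, province_key="PROVINCE", sex_key="SEX"):
    """Count users by province, age group and sex.

    `province_key` and `sex_key` are hypothetical field names; only "AGE"
    appears in the sample shown in the excerpt.
    """
    by_province, by_age, by_sex = Counter(), Counter(), Counter()
    for line in json_lines:
        record = json.loads(line)
        by_province[record[province_key]] += 1
        by_age[age_group(record["AGE"])] += 1
        by_sex[record[sex_key]] += 1
    return {
        "province": dict(by_province),
        "age_group": dict(by_age),
        "sex": dict(by_sex),
    }
```

In Spark the same idea would typically be three grouped counts over a DataFrame; the counters here play that role on a small in-memory input.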

Flink Realtime Word Counting

Word Count: Create a Maven project and add the org.apache.flink dependency to pom.xml. The full file can be downloaded at the end of this article. Task 1: Create a package named “com.example”, and then cr...

Kafka Tutorial

1. Install: Download Kafka from kafka.apache.org and extract it to /opt/module. [root@hadoop001 software]# tar xvf kafka_2.11-2.4.1.tgz -C /opt/module/ 2. Configuration: Go to the $KAFKA_HOME/config/ dir...

Kafka Notes

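Overall: Kafka originated at LinkedIn, where it was used to analyse basic metrics (memory, CPU, disk, network, etc.) and application metrics from the various business systems, a job the custom-built systems gradually could no longer handle. As data volumes grew and business requirements became more complex, the custom-built systems ran into more and more problems. This gradually evolved into Kafka, a messaging system that both handles real-time processing and supports horizontal scaling. It is a publish-subscribe message queue system written in Scala, well suited to both offline and online message consumption. Message storage...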

Flink Tutorial

1 Install: Download Flink from Apache and extract it to /opt/module. 2 Configuration 2.1 Set environment variables: [root@hadoop001 ~]# cd /opt/module/flink-1.12.7/ [root@hadoop001 flink-1.12.7...

Spark Machine Learning

1 Core concepts: Transformer, which implements a transform() method that transforms a DataFrame into another DataFrame with more columns; Estimator, which implements a fit() method that manipulates a DataFrame ...

Spark SQL Practice

1. Features of Spark SQL: integrated SQL queries within Spark programs; uniform data access; Hive compatibility; standard connectivity (JDBC/ODBC). 2. Create Table 2.1 case cla...