TableStore in Practice: Real-Time Analysis of TableStore with DLA + SQL

mysql> create database hangzhou_ots_test with dbproperties (
    catalog = 'ots',
    location = 'https://instancename.cn-hangzhou.ots-internal.aliyuncs.com',
    instance = 'instancename'
);

Query OK, 0 rows affected (0.23 sec)

# hangzhou_ots_test: the database name; letters, digits, and underscores are allowed
# catalog = 'ots': marks the data source as OTS, to distinguish it from other sources such as OSS or RDS
# location = 'https://xxx': the OTS endpoint, which is shown on the instance page
# instance = 'HZ-TPCH-1X-VOL': the OTS instance name

mysql> show databases;
+-------------------+
| Database          |
+-------------------+
| hangzhou_ots_test |
+-------------------+
1 row in set (0.22 sec)

mysql> use hangzhou_ots_test;
Database changed

mysql> show tables;
Empty set (0.30 sec)

mysql> create external table `tablename` (
    `pk1` varchar(100) not null,
    `pk2` int not null,
    `col1` varchar(100) null,
    `col2` varchar(100) null,
    primary key (`pk1`, `pk2`)
);
Query OK, 0 rows affected (0.36 sec)

## `tablename`: the name of the corresponding TableStore table (DLA converts it to lowercase)
## `pk2` int not null: a primary key column must be declared NOT NULL
## primary key (`pk1`, `pk2`): must match the primary key in OTS, including the column names

mysql> show tables;
+------------+
| Table_Name |
+------------+
| tablename  |
+------------+
1 row in set (0.35 sec)

mysql> select count(*) from tablename;
+-------+
| _col0 |
+-------+
| 25    |
+-------+
1 row in set (1.19 sec)

First, background

What is DLA (Data Lake Analytics)? It is a serverless interactive query and analysis service. As a distributed interactive analysis service, it is one of the important components of the Table Storage computing ecosystem. This hands-on example was written to help users better understand what DLA can do.

With DLA, you can run big-data join analysis across multiple heterogeneous data sources without any ETL or data migration, and write results back to each of those sources. This saves cost, reduces latency, and improves the user experience.

Based on JDBC, the Table Storage console integrates SQL queries directly. The data sits in a public instance, so users can try real-time SQL analysis of Table Storage without activating the service. The query feature is available in the official console as a project sample.

Scenario: Black Friday transaction data

In this hands-on case, we take the data from https://www.kaggle.com/meldidag/black-friday, store it in TableStore, and then analyze it with DLA, so you can see the value of the data for yourself.

Black Friday is the craziest shopping day of the year for Americans, similar to China's "Double Eleven" shopping carnival.

Black Friday promotions mainly take place offline, but they are gradually moving online as well. Amazon, for example, runs online sales for Black Friday that closely resemble Tmall's Double Eleven. Events like this also generate a lot of meaningful business data.

We define a table called BlackFriday50W in DLA, mapped to a table in TableStore, that describes users' purchases of goods.

[Figure: table structure of the sample data and a screenshot of the real data]

Second, Table Storage (TableStore)

Preparation

If you are interested in real-time online analysis of TableStore with DLA and want to build your own system, just follow the steps below:

1. Activate Table Storage

Activate the Table Storage service through the console. It is ready to use as soon as it is activated and is billed pay-as-you-go (postpaid), with a free quota provided for functional testing. See the Table Storage official console and the free quota description.

2. Create an instance

Create a Table Storage instance through the console.

3. Import data

The dataset has 538,000 rows and 12 columns in total. We write the full dataset into a TableStore table via the SDK. Users can also insert two test rows through the console.
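As a rough illustration of the SDK import step, the sketch below splits each CSV record into primary-key columns and attribute columns before it would be written. It does not call the TableStore SDK. The choice of `User_ID` and `Product_ID` as the primary key, the helper name `to_row_parts`, and the list-of-pairs row shape are all assumptions made for illustration, and the two sample records are made up.

```python
import csv
import io

# Assumed primary key; the article does not say which columns form it.
PRIMARY_KEY = ("User_ID", "Product_ID")


def to_row_parts(record):
    """Split one CSV record into (primary_key, attribute_columns) pair lists."""
    primary_key = [(name, record[name]) for name in PRIMARY_KEY]
    # TableStore rows are schema-free, so empty cells are simply left out.
    attributes = [
        (name, value)
        for name, value in record.items()
        if name not in PRIMARY_KEY and value != ""
    ]
    return primary_key, attributes


# Two made-up records using a subset of the CSV's columns.
sample = io.StringIO(
    "User_ID,Product_ID,Gender,Age,Product_Category_2,Purchase\n"
    "1,P001,F,18-25,,8000\n"
    "2,P002,M,26-35,4,15000\n"
)
parts = [to_row_parts(record) for record in csv.DictReader(sample)]
```

Each element of `parts` could then be handed to whatever write call your SDK version provides. Values stay as strings here; a real import would likely convert numeric columns first.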

Activate the DLA service

DLA service activation

Go to the product introduction page and choose to activate the service: https://www.aliyun.com/product/datalakeanalytics

Enable the TableStore data source in the DLA console

After enabling the data source, create a service access point (choose the classic network; if you already have a VPC, you can choose VPC instead).

Log in to the CMS (the account and password are sent by site message after the service is activated; look them up in that message).

Create a DLA external table

The statements and output for each step below appear in the console transcript at the top of this article, in the same order.

1) Create your own DLA database (the required information comes from the steps above).

2) View the database you created.

3) List the tables in your DLA database (empty at first).

4) Create a DLA table and map it to OTS.

5) View your table and the related DDL statements.

6) Start querying and analyzing (you can analyze your own data using MySQL-compatible syntax).

At this point, an external table associated with TableStore has been created in DLA, and you can analyze your TableStore table in real time according to your own needs.
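The notes on the external-table statement give three rules: primary-key columns must be NOT NULL, the primary key must match the one in OTS, and DLA lowercases the table name. A minimal sketch of a helper that applies those rules when generating the DDL text is shown here. The function name `build_external_table_ddl` and its parameters are hypothetical and are not part of DLA or any SDK.

```python
def build_external_table_ddl(table, columns, primary_key):
    """Render CREATE EXTERNAL TABLE text for a TableStore-backed DLA table.

    columns: list of (name, sql_type) pairs, in declaration order.
    primary_key: column names, in the same order as the OTS primary key.
    """
    declared = [name for name, _ in columns]
    missing = [pk for pk in primary_key if pk not in declared]
    if missing:
        raise ValueError(f"primary key columns not declared: {missing}")

    lines = []
    for name, sql_type in columns:
        # Primary-key columns must be NOT NULL; the rest may be NULL.
        nullability = "NOT NULL" if name in primary_key else "NULL"
        lines.append(f"    `{name}` {sql_type} {nullability}")
    pk_list = ", ".join(f"`{pk}`" for pk in primary_key)
    lines.append(f"    PRIMARY KEY ({pk_list})")

    body = ",\n".join(lines)
    # The table name is lowercased, mirroring what DLA does with it.
    return f"CREATE EXTERNAL TABLE `{table.lower()}` (\n{body}\n);"


ddl = build_external_table_ddl(
    "Tablename",
    [("pk1", "varchar(100)"), ("pk2", "int"), ("col1", "varchar(100)")],
    ["pk1", "pk2"],
)
```

The helper only builds text. Whether the primary key really matches the OTS table still has to be checked against the instance itself.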

Third, Table Storage console demo

The following SQL scenarios are provided by the console. Users can also write their own SQL against this case and try it out!

Top 10 best-selling products and their sales

Share of overall GMV from mid-to-high-end products

Average purchase amount across age groups

Gender and age distribution of high spenders
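The four scenarios above can be sketched as plain SQL. The article does not show the actual queries or the real table definition, so everything below is an assumption: the column names follow the public Black Friday CSV in lowercase, the 10000 cutoff for "mid-to-high-end" and "high spender" is arbitrary, and the rows are made up. To keep the example runnable without a DLA account, it uses Python's built-in sqlite3 as a local stand-in. The queries stick to basic GROUP BY, ORDER BY, and LIMIT, but may still need adjusting for DLA's dialect.

```python
import sqlite3

# Hypothetical schema: the article does not show the real blackfriday50w DDL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blackfriday50w ("
    " user_id INTEGER, product_id TEXT, gender TEXT, age TEXT, purchase INTEGER)"
)
# Made-up rows for illustration only, not taken from the dataset.
rows = [
    (1, "P001", "F", "18-25", 8000),
    (1, "P002", "F", "18-25", 15000),
    (2, "P001", "M", "26-35", 8000),
    (3, "P002", "M", "26-35", 15000),
    (4, "P003", "M", "36-45", 4000),
]
conn.executemany("INSERT INTO blackfriday50w VALUES (?, ?, ?, ?, ?)", rows)

HIGH_PRICE = 10000  # assumed cutoff for "mid-to-high-end" and "high spender"

# Scenario 1: top 10 products by number of sales, with their total amount.
top_products = conn.execute(
    "SELECT product_id, COUNT(*) AS sales, SUM(purchase) AS gmv"
    " FROM blackfriday50w GROUP BY product_id"
    " ORDER BY sales DESC, product_id LIMIT 10"
).fetchall()

# Scenario 2: share of total GMV from purchases at or above the cutoff.
high_end_share = conn.execute(
    "SELECT SUM(CASE WHEN purchase >= ? THEN purchase ELSE 0 END) * 1.0"
    " / SUM(purchase) FROM blackfriday50w",
    (HIGH_PRICE,),
).fetchone()[0]

# Scenario 3: average purchase amount per age group.
avg_by_age = conn.execute(
    "SELECT age, AVG(purchase) FROM blackfriday50w GROUP BY age ORDER BY age"
).fetchall()

# Scenario 4: distinct high spenders broken down by gender and age group.
high_spenders = conn.execute(
    "SELECT gender, age, COUNT(DISTINCT user_id) FROM blackfriday50w"
    " WHERE purchase >= ? GROUP BY gender, age ORDER BY gender, age",
    (HIGH_PRICE,),
).fetchall()
```

Against DLA, the same SELECT statements would be sent through the MySQL-compatible endpoint instead of a local connection.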

Fourth, welcome to join

With that, real-time analysis of TableStore based on DLA + SQL is complete. Simple, isn't it?

Author: Tongtan