DATA 전문가로 가는 길

[Admin][MariaDB] JSON 테이블에 적재(Json Table Type) 본문

Data Architecture/Admin

[Admin][MariaDB] JSON 테이블에 적재(Json Table Type)

EstenPark 2016. 4. 18. 08:59

Python을 이용해서 주식 데이터를 가져올 경우 보통 JSON 포멧으로 된 것을 확인 할 수 있습니다. 따라서 JSON을 파싱해서 테이블에 저장할 수도 있지만, 이번에는 CONNECT 엔젠과 JSON 타입을 이용해서 빠르게 적재하는 방법을 확인 해보도록 하겠습니다.


connect-json-table-type 가이드 바로기기


1. connect Engine 설치


# install MariaDB-connect-engine
[root@localhost ~]# yum install MariaDB-connect-engine
# configure SELinux to allow outgoing connection form mysqld to allow CONNECT to work
[root@localhost ~]# setsebool -P mysql_connect_any 1
# MariaDB DB Start
[root@localhost ~]# service maria start
Starting maria (via systemctl):  [  OK  ]
# install "INSTALL SONAME 'ha_connect'" MariaDB 권한을 가진 USER로 접속해야 가능함.
[root@localhost ~]# mysql -uroot -pmaria
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 17
Server version: 10.1.13-MariaDB MariaDB Server

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> INSTALL SONAME 'ha_connect'
    -> ;
Query OK, 0 rows affected (0.01 sec)

MariaDB [(none)]> SHOW ENGINES ;
+--------------------+---------+--------------------------------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                                          | Transactions | XA   | Savepoints |
+--------------------+---------+--------------------------------------------------------------------------------------------------+--------------+------+------------+
| MRG_MyISAM         | YES     | Collection of identical MyISAM tables                                                            | NO           | NO   | NO         |
| CSV                | YES     | CSV storage engine                                                                               | NO           | NO   | NO         |
| Aria               | YES     | Crash-safe tables with MyISAM heritage                                                           | NO           | NO   | NO         |
| MyISAM             | YES     | MyISAM storage engine                                                                            | NO           | NO   | NO         |
| CONNECT            | YES     | Management of External Data (SQL/MED), including many file formats                               | NO           | NO   | NO         |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                                        | NO           | NO   | NO         |
| InnoDB             | DEFAULT | Percona-XtraDB, Supports transactions, row-level locking, foreign keys and encryption for tables | YES          | YES  | YES        |
| SEQUENCE           | YES     | Generated tables filled with sequential values                                                   | YES          | NO   | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                                               | NO           | NO   | NO         |
+--------------------+---------+--------------------------------------------------------------------------------------------------+--------------+------+------------+
9 rows in set (0.00 sec)

MariaDB [(none)]> SHOW PLUGINS;
+-------------------------------+----------+--------------------+---------------+---------+
| Name                          | Status   | Type               | Library       | License |
+-------------------------------+----------+--------------------+---------------+---------+
| binlog                        | ACTIVE   | STORAGE ENGINE     | NULL          | GPL     |
| mysql_native_password         | ACTIVE   | AUTHENTICATION     | NULL          | GPL     |
| mysql_old_password            | ACTIVE   | AUTHENTICATION     | NULL          | GPL     |
..생략..
| CONNECT                       | ACTIVE   | STORAGE ENGINE     | ha_connect.so | GPL     |
+-------------------------------+----------+--------------------+---------------+---------+

# 링크 확인 아래와 같이 되어 있을 경우 정상
[root@localhost lib64]# ls -al |grep libodbc.so.2.0.0
lrwxrwxrwx.  1 root root        16  4월  5 14:58 libodbc.so -> libodbc.so.2.0.0
lrwxrwxrwx.  1 root root        16  4월  5 14:58 libodbc.so.2 -> libodbc.so.2.0.0
-rwxr-xr-x.  1 root root    429728  8월  6  2015 libodbc.so.2.0.0

# 링크가 없을 경우 링크 생성
[root@localhost ~]#  cd /usr/lib64
[root@localhost lib64]# ln -s libodbc.so.2.0.0 libodbc.so.1


2. JSON Table Type 생성 및 조회


 
# file_name에 디렉토리 경로를 지정하지 않을 경우 데이터베이스 안에 파일로 저장 됨.
[root@localhost estdb]# pwd
/var/lib/mysql/estdb
[root@localhost estdb]# vi json_sample_data.json 
{ "items": [
{ "id": 0, "value": "a" },
{ "id": 1, "value": "b" },
{ "id": 2, "value": "c" },
{ "id": 3, "value": "d" },
{ "id": 5, "value": "e" },
{ "id": 6, "value": "f" },
{ "id": 7, "value": "g" },
{ "id": 8, "value": "h" },
{ "id": 9, "value": "i" },
{ "id": 10, "value": "j" },
{ "id": 11, "value": "k" },
{ "id": 12, "value": "l" },
{ "id": 13, "value": "m" }
] }

[root@localhost ~]# mysql -uroot -pmaria estdb

MariaDB [estdb]> create table est_json (
    -> id int, 
    -> value char(1)) engine=connect, 
    -> table_type=json, 
    -> file_name='json_sample_data.json'
    -> ;

MariaDB [estdb]> select * from est_json;
+------+-------+
| id   | value |
+------+-------+
|    1 | b     |
|    2 | c     |
|    3 | d     |
|    5 | e     |
|    6 | f     |
|    7 | g     |
|    8 | h     |
|    9 | i     |
|   10 | j     |
|   11 | k     |
|   12 | l     |
|   13 | m     |
+------+-------+
12 rows in set (0.00 sec)

# 50000여건 정상적으로 로딩 완료
MariaDB [estdb]> select count(*) from est_json;
+----------+
| count(*) |
+----------+
|    51192 |
+----------+
1 row in set (0.05 sec)

# 161188건 정상적으로 로딩 완료
MariaDB [estdb]> select count(*) from est_json;
+----------+
| count(*) |
+----------+
|   161188 |
+----------+
1 row in set (0.14 sec)

# CTAS를 이용해서 데이터 이관 작업
MariaDB [estdb]> create table est_json_ctas as select * from est_json;
Query OK, 161188 rows affected (1.12 sec)
Records: 161188  Duplicates: 0  Warnings: 0

# 정상적으로 이관 확인
MariaDB [estdb]> select count(*) from est_json_ctas;
+----------+
| count(*) |
+----------+
|   161188 |
+----------+
1 row in set (0.05 sec)

# 대략 4.5MB 이상 넘어갈 경우 SELECT할 때 에러가 날 수 있습니다.
[root@localhost estdb]# ls -alSh |grep json_sample_data.json
-rw-r--r--. 1 mysql mysql 4.5M  4월 18 10:30 json_sample_data.json


3. Meanwhile, the only workaround is to increase the memory used by CONNECT. For instance doing:


 
set global connect_work_size=150000000;


Comments