Detailed Install Guide
This document walks through installing Metatron and using its data preparation feature from scratch on a Linux environment (CentOS 7).
1. Install requirements
Run the following commands as root.
yum clean all && yum repolist && yum -y update
yum -y install tar unzip vi vim telnet apr apr-util apr-devel apr-util-devel net-tools curl openssl elinks locate python-setuptools
yum -y install java-1.8.0-openjdk-devel.x86_64
export JAVA_HOME=/usr/lib/jvm/java
export PATH=$PATH:$JAVA_HOME/bin
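These exports apply only to the current shell session. One way to persist them across logins (the target path is a suggestion, not part of the original guide) is to collect them into a profile script:

```shell
# Persist the Java environment in a profile script. The target path is a
# suggestion; point PROFILE at /etc/profile.d/java-env.sh when running as root,
# since /etc/profile.d/ is the conventional location on CentOS.
PROFILE=${PROFILE:-./java-env.sh}
cat > "$PROFILE" <<'EOF'
export JAVA_HOME=/usr/lib/jvm/java
export PATH=$PATH:$JAVA_HOME/bin
EOF
```

Source the script (`. "$PROFILE"`) or log in again for the exports to take effect.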
2. Install Hadoop
Run the commands below as root. Downloading the Hadoop binary from the mirror closest to you is recommended.
yum -y install openssh-server openssh-clients rsync wget
yum -y update libselinux
ssh-keygen -q -N "" -t dsa -f /etc/ssh/ssh_host_dsa_key
ssh-keygen -q -N "" -t rsa -f /etc/ssh/ssh_host_rsa_key
ssh-keygen -q -N "" -t rsa -f /root/.ssh/id_rsa
cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
wget http://archive.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
tar -zxvf hadoop-2.7.3.tar.gz -C /opt
rm -f hadoop-2.7.3.tar.gz
ln -s /opt/hadoop-2.7.3 /opt/hadoop
export HADOOP_PREFIX=/opt/hadoop
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
export YARN_CONF_DIR=$HADOOP_CONF_DIR
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
sed -i "/^export JAVA_HOME/ s:.*:export JAVA_HOME=$JAVA_HOME:" $HADOOP_CONF_DIR/hadoop-env.sh
sed -i "/^export HADOOP_CONF_DIR/ s:.*:export HADOOP_CONF_DIR=$HADOOP_CONF_DIR:" $HADOOP_CONF_DIR/hadoop-env.sh
Put the files below into $HADOOP_CONF_DIR.
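The linked configuration files are not reproduced here. As an illustration only, a minimal single-node pair of core-site.xml and hdfs-site.xml can be generated like this (the fs.defaultFS host/port and dfs.replication values are assumptions for a one-machine setup; adjust to your environment):

```shell
# Hypothetical minimal single-node Hadoop configs. The fs.defaultFS value
# and dfs.replication=1 are assumptions for a one-machine cluster.
CONF_DIR=${HADOOP_CONF_DIR:-.}   # falls back to the current dir if unset
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
```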
Run the following as root.
$HADOOP_PREFIX/bin/hdfs namenode -format
Append the following to /root/.ssh/config:
Host *
UserKnownHostsFile /dev/null
StrictHostKeyChecking no
LogLevel quiet
Port 2122
Run the following as root.
chmod 600 /root/.ssh/config
chown root:root /root/.ssh/config
chmod +x $HADOOP_CONF_DIR/*-env.sh
sed -i "/^[^#]*UsePAM/ s/.*/#&/" /etc/ssh/sshd_config
echo "UsePAM no" >> /etc/ssh/sshd_config
echo "Port 2122" >> /etc/ssh/sshd_config
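The effect of these three edits can be previewed on a scratch copy before touching the real /etc/ssh/sshd_config:

```shell
# Dry run on a scratch copy: comment out any active UsePAM line, then append
# the new settings, exactly as done above on the real file.
cp /etc/ssh/sshd_config ./sshd_config.test 2>/dev/null \
  || printf 'UsePAM yes\n' > ./sshd_config.test   # fallback sample content
sed -i "/^[^#]*UsePAM/ s/.*/#&/" ./sshd_config.test
echo "UsePAM no"  >> ./sshd_config.test
echo "Port 2122"  >> ./sshd_config.test
grep -E '^(#.*UsePAM|UsePAM|Port)' ./sshd_config.test
```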
Restart the SSH server.
service sshd restart
Run HDFS and Yarn daemons.
start-dfs.sh
start-yarn.sh
Test if Hadoop works fine.
hdfs dfs -mkdir -p /user/hadoop/input
hdfs dfs -put $HADOOP_PREFIX/LICENSE.txt /user/hadoop/input
hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /user/hadoop/input /user/hadoop/output
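The wordcount job counts occurrences of each word in the input. What it computes can be approximated locally with plain shell, as a quick sanity reference for the HDFS result (the part-r-* files that land under /user/hadoop/output):

```shell
# Local approximation of what the wordcount job computes, for comparison
# with `hdfs dfs -cat /user/hadoop/output/part-r-*` on the real cluster.
printf 'to be or not to be\n' > wc-sample.txt
tr -s ' ' '\n' < wc-sample.txt | sort | uniq -c | sort -rn
# counts per word: 2 to, 2 be, 1 or, 1 not
```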
3. Install MySQL
wget http://dev.mysql.com/get/mysql57-community-release-el7-7.noarch.rpm \
&& yum -y localinstall mysql57-community-release-el7-7.noarch.rpm \
&& yum repolist enabled | grep "mysql.*-community.*" \
&& yum -y install mysql-community-server mysql \
&& rm -f mysql57-community-release-el7-7.noarch.rpm
service mysqld start
Get the temporary password with the following command.
grep 'temporary password' /var/log/mysqld.log | awk '{print $11}'
Z&O+estx9vTt
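The grep/awk pipeline pulls the 11th whitespace-separated field out of MySQL's log line. Its behavior can be checked against a sample line in the 5.7 log format (the password below is a made-up example):

```shell
# Sample mysqld.log line in the MySQL 5.7 format; the password is fake.
line='2019-01-01T00:00:00.000000Z 1 [Note] A temporary password is generated for root@localhost: Z&O+estx9vTt'
echo "$line" | grep 'temporary password' | awk '{print $11}'
# prints the 11th field, i.e. the temporary password: Z&O+estx9vTt
```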
Run mysql_secure_installation with the temporary password.
mysql_secure_installation
Enter password for user root: -> Z&O+estx9vTt
New password: -> Metatron123$
Re-enter new password: -> Metatron123$
Change the password for root? (Press y|Y for Yes, any other key for No) : y
New password: -> Metatron123$
Re-enter new password: -> Metatron123$
Do you wish to continue with the password provided? -> y
Remove anonymous users? -> enter
Disallow root login remotely? -> enter
Remove test database and access to it? -> enter
Reload privilege tables now? -> enter
Connect to MySQL.
mysql -uroot -pMetatron123$
4. Install Hive
wget http://mirror.navercorp.com/apache/hive/hive-2.3.6/apache-hive-2.3.6-bin.tar.gz \
&& tar -zxvf apache-hive-2.3.6-bin.tar.gz -C /opt \
&& rm -f apache-hive-2.3.6-bin.tar.gz \
&& ln -s /opt/apache-hive-2.3.6-bin /opt/hive
export HIVE_HOME=/opt/hive
export PATH=$PATH:$HIVE_HOME/bin:$HIVE_HOME/hcatalog/sbin
wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.38/mysql-connector-java-5.1.38.jar
mv mysql-connector-java-5.1.38.jar $HIVE_HOME/lib/
Put the files below into $HIVE_HOME/conf.
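The linked configuration files are not shown here. For reference, a hypothetical minimal hive-site.xml wiring the metastore to the MySQL database created in the next step might look like this (the connection values mirror the accounts set up in this guide and are otherwise assumptions):

```shell
# Hypothetical minimal hive-site.xml pointing the metastore at the MySQL
# database created in the next step; adjust host, user, and password as needed.
CONF_DIR=${HIVE_CONF_DIR:-.}   # set to $HIVE_HOME/conf in this guide
cat > "$CONF_DIR/hive-site.xml" <<'EOF'
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive_metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>Metatron123$</value>
  </property>
</configuration>
EOF
```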
Initialize the Hive metastore.
mysql -uroot -pMetatron123$
create database hive_metastore;
create user 'hive'@'%' identified by 'Metatron123$';
grant all privileges on *.* to 'hive'@'%';
grant all privileges on hive_metastore.* to 'hive'@'%';
create user 'hive'@'localhost' identified by 'Metatron123$';
grant all privileges on *.* to 'hive'@'localhost';
grant all privileges on hive_metastore.* to 'hive'@'localhost';
flush privileges;
quit
schematool -initSchema -dbType mysql
Start Hive.
hdfs dfs -mkdir -p /user/hive/warehouse
mkdir -p $HIVE_HOME/hcatalog/var/log
hcat_server.sh start
hiveserver2 &
Connect to Hive.
beeline -u jdbc:hive2://localhost:10000 "" ""
5. Install Druid
wget https://sktmetatronkrsouthshared.blob.core.windows.net/metatron-public/discovery-dist/latest/druid-0.9.1-latest-hadoop-2.7.3-bin.tar.gz
mkdir /servers
tar zxf druid-0.9.1-latest-hadoop-2.7.3-bin.tar.gz -C /servers
ln -s /servers/druid-* /servers/druid
export DRUID_HOME=/servers/druid
Put the files below into each target location.

| Download URL | Target Location |
|---|---|
| | $DRUID_HOME/conf/druid/single/jvm.config |
| | $DRUID_HOME/conf/druid/single/broker/runtime.properties |
| | $DRUID_HOME/conf/druid/single/historical/runtime.properties |
| | $DRUID_HOME/conf/druid/single/middleManager/runtime.properties |
cd $DRUID_HOME
./start-single.sh
Check that you can connect to http://localhost:8090/
6. Install Metatron
wget https://sktmetatronkrsouthshared.blob.core.windows.net/metatron-public/discovery-dist/latest/metatron-discovery-latest-bin.tar.gz
mkdir -p /servers
tar zxf metatron-discovery-latest-bin.tar.gz -C /servers
ln -s /servers/metatron-discovery-* /servers/metatron-discovery
export METATRON_HOME=/servers/metatron-discovery
Put the files below into $METATRON_HOME/conf.
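Metatron Discovery is a Spring Boot application, so its datasource settings follow the standard Spring format. A hypothetical fragment matching the polaris account created below is sketched here; the file name and exact keys are assumptions, so compare against the template files shipped in $METATRON_HOME/conf before using:

```shell
# Hypothetical datasource fragment for Metatron (Spring Boot style); the
# file name and keys are assumptions -- check the templates shipped in
# $METATRON_HOME/conf before relying on this.
CONF_DIR=${METATRON_CONF_DIR:-.}   # set to $METATRON_HOME/conf in this guide
cat > "$CONF_DIR/application-config.yaml" <<'EOF'
spring:
  datasource:
    driver-class-name: com.mysql.jdbc.Driver
    url: jdbc:mysql://localhost:3306/polaris
    username: polaris
    password: Metatron123$
EOF
```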
Initialize Metatron.
mysql -uroot -pMetatron123$
create database polaris;
create user 'polaris'@'%' identified by 'Metatron123$';
grant all privileges on *.* to 'polaris'@'%';
grant all privileges on hive_metastore.* to 'polaris'@'%';
create user 'polaris'@'localhost' identified by 'Metatron123$';
grant all privileges on *.* to 'polaris'@'localhost';
grant all privileges on hive_metastore.* to 'polaris'@'localhost';
flush privileges;
quit
cd $METATRON_HOME
bin/metatron.sh --init start
To watch the progress, tail the log file.
tail -f logs/metatron-*.out
Connect to http://localhost:8180/
7. Install Preptool
yum -y install https://centos7.iuscommunity.org/ius-release.rpm \
&& yum install -y python36u python36u-libs python36u-devel python36u-pip git \
&& ln -s /bin/python3.6 /bin/python3 \
&& ln -s /bin/pip3.6 /bin/pip3 \
&& pip3 install requests
git clone https://github.com/metatron-app/discovery-prep-tool.git
cd discovery-prep-tool
Download a test file.
python3 preptool -f sales-data-sample.csv
If you get “File dataset created”, then it works.
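If the sample file is not at hand, any small CSV exercises the same path. A made-up stand-in can be generated like this (header and rows are invented for illustration only), after which the preptool command above can be rerun:

```shell
# Made-up stand-in for sales-data-sample.csv; columns and values are invented.
cat > sales-data-sample.csv <<'EOF'
region,product,amount
east,apple,10
west,banana,20
EOF
```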