Installing Hadoop on macOS with brew

1. Introduction

I have always installed Hadoop, Hive, HBase, and the other big data components in a virtual machine, but as my laptop has aged, its performance has dropped to the point where just starting a VM brings it to a crawl. So I decided to install Hadoop directly on macOS and spare myself the VM's resource overhead; to keep things easy to manage, I chose to install it with brew.

2. Installing Hadoop

Installing Hadoop with brew is very convenient; just run:

brew install hadoop

Before installing, make sure a JDK is installed and JAVA_HOME is configured.
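
A quick pre-flight check (a sketch; /usr/libexec/java_home is the stock macOS helper, and -v 11 assumes you want a JDK 11, so adjust to your version):

# List the JDKs macOS knows about
/usr/libexec/java_home -V
# Point JAVA_HOME at a specific major version
export JAVA_HOME=$(/usr/libexec/java_home -v 11)
java -version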

For convenience, I also added a hosts entry:

sudo vim /etc/hosts

127.0.0.1       hadoop
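
An optional sanity check that the new alias resolves:

# Should get replies from 127.0.0.1
ping -c 1 hadoop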

3. Configuring Hadoop

3.1. Configure passwordless login

macOS disables Remote Login by default, so it has to be turned on first. Open System Preferences -> Sharing, check Remote Login on the left, choose "Only these users" on the right, and add the current user.
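
If you prefer the terminal, the same switch can be flipped with systemsetup (an aside, not from the original steps; recent macOS versions may ask for extra privacy permissions):

# Check and enable Remote Login from the command line
sudo systemsetup -getremotelogin
sudo systemsetup -setremotelogin on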

Then generate a key pair:

# Generate the private and public keys (empty passphrase)
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# Append the public key to authorized_keys to enable passwordless login
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# Fix the permissions, otherwise sshd may ignore the file
chmod 0600 ~/.ssh/authorized_keys

# Once configured, test that ssh no longer asks for a password
# (I use the hadoop alias set up earlier; ssh localhost works too)
ssh hadoop

3.2. Edit the Hadoop configuration files

brew installs Hadoop under /usr/local/Cellar/hadoop/3.3.4 (adjust the paths below if your version differs).

Edit hadoop-env.sh

vim /usr/local/Cellar/hadoop/3.3.4/libexec/etc/hadoop/hadoop-env.sh

# Set JAVA_HOME
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk-11.0.17.jdk/Contents/Home
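
If you would rather not hard-code the JDK path here, macOS can resolve it for you (a sketch using the same java_home helper as above):

# Alternative: resolve JAVA_HOME dynamically in hadoop-env.sh
export JAVA_HOME=$(/usr/libexec/java_home -v 11)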

Edit core-site.xml

vim /usr/local/Cellar/hadoop/3.3.4/libexec/etc/hadoop/core-site.xml

Fill in the following:

<configuration>
  <!-- fs.default.name is the deprecated alias; fs.defaultFS is the current name -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop:8020</value>
  </property>
  <!-- Keeping these directories out of /tmp prevents a reboot from wiping
       the NameNode data and leaving it unable to start -->
  <!-- You can point them anywhere you like, as long as Hadoop has write permission -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/liyang/Documents/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <!-- Local filesystem path where the NameNode stores the name table
       (these two properties usually live in hdfs-site.xml, but Hadoop
       picks them up from core-site.xml as well) -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/Users/liyang/Documents/hadoop/filesystem/name</value>
    <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
  </property>
  <!-- Local filesystem path where the DataNode stores its blocks -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/Users/liyang/Documents/hadoop/filesystem/data</value>
    <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
  </property>
</configuration>

Edit hdfs-site.xml

vim /usr/local/Cellar/hadoop/3.3.4/libexec/etc/hadoop/hdfs-site.xml

Fill in the following:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop:9868</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>hadoop:9870</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address</name>
    <value>hadoop:8020</value>
  </property>
</configuration>

Edit mapred-site.xml

vim /usr/local/Cellar/hadoop/3.3.4/libexec/etc/hadoop/mapred-site.xml

Fill in the following:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

Edit yarn-site.xml

vim /usr/local/Cellar/hadoop/3.3.4/libexec/etc/hadoop/yarn-site.xml

Fill in the following:

<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>
</configuration>
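
With all four XML files edited, a quick well-formedness check can save a confusing startup failure (xmllint ships with macOS):

# Verify each edited file is well-formed XML
cd /usr/local/Cellar/hadoop/3.3.4/libexec/etc/hadoop
xmllint --noout core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml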

Edit workers

vim /usr/local/Cellar/hadoop/3.3.4/libexec/etc/hadoop/workers

# The default is localhost, which is fine to keep; I changed it because I set up the hadoop alias
hadoop

3.3. Create the directories

# Create the base directory referenced in the configs above
# (the tmp and filesystem subdirectories are created by Hadoop itself)
mkdir -p /Users/liyang/Documents/hadoop

4. Starting Hadoop

4.1. Set the environment variables

vim ~/.zshrc

export HADOOP_HOME="/usr/local/Cellar/hadoop/3.3.4/libexec"
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
export HADOOP_CONF_DIR="/usr/local/Cellar/hadoop/3.3.4/libexec/etc/hadoop"
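
Reload the shell config and confirm both the commands and the configuration are picked up (hdfs getconf is a standard Hadoop utility; fs.defaultFS should print hdfs://hadoop:8020):

source ~/.zshrc
hadoop version
hdfs getconf -confKey fs.defaultFS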

4.2. Format HDFS

# Only format once; re-running this on an existing installation generates a new
# namespace ID, and the DataNode will refuse to start until its data directory is cleared
hdfs namenode -format

4.3. Start all services

# If the command is not found, the environment variables are probably
# missing or not yet in effect (open a new shell or source ~/.zshrc)
start-all.sh

The output looks like this:

(base) ➜  ~ start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as liyang in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [hadoop]
Starting datanodes
Starting secondary namenodes [hadoop]
2023-06-20 21:00:36,813 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting resourcemanager
Starting nodemanagers

# Check the running processes
(base) ➜  ~ jps
14914 NameNode
10739
15526 Jps
15160 SecondaryNameNode
15354 ResourceManager
15018 DataNode
15454 NodeManager
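
The matching shutdown script, for when you are done:

# Stops the HDFS and YARN daemons started above
stop-all.sh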

Visit the web UIs to check the status:

  • NameNode: http://localhost:9870
  • ResourceManager: http://localhost:8088/cluster
  • NodeManager: http://localhost:8042/node

Run a few commands to test that everything works:

(base) ➜  ~ hadoop fs -ls /
2023-06-20 21:06:47,933 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
(base) ➜  ~ hadoop fs -mkdir /test
2023-06-20 21:06:56,633 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
(base) ➜  ~ hadoop fs -ls /
2023-06-20 21:07:01,184 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x   - liyang supergroup          0 2023-06-20 21:06 /test
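
To also exercise MapReduce on YARN, you can run the bundled example job (a sketch; the jar version matches this install, adjust if yours differs):

# Estimate pi with 2 mappers, 5 samples each
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar pi 2 5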

These commands print a lot of WARN messages saying the native Hadoop library could not be loaded. Possible fixes are covered here:

https://www.cnblogs.com/shoufeng/p/14940245.html

I chose to just adjust the logging: this setup is only for learning, so there is no need to chase the performance of the native library.
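
For reference, the logging tweak boils down to raising the NativeCodeLoader log threshold (a sketch of the approach in the linked post; the path assumes the brew install above):

# Silence the native-library warning by only logging it at ERROR and above
echo "log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR" >> \
  /usr/local/Cellar/hadoop/3.3.4/libexec/etc/hadoop/log4j.properties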
