How to Install Storm: A Step-by-Step Tutorial
Apache Storm is a widely used distributed real-time computation system for processing large streams of data with low latency. This tutorial walks through installing Storm on a Linux system for development or production use. It covers the prerequisites, installing dependencies, setting up Storm and Zookeeper, and verifying the installation.
Prerequisites
- Linux-based operating system (Ubuntu, CentOS, Debian, etc.)
- Java Development Kit (JDK) 8 or higher installed
- Apache Zookeeper installed and running (Storm uses Zookeeper for coordination; Step 2 covers installing it)
- Basic command-line knowledge
- Network connectivity to download required packages
Step 1: Install Java Development Kit (JDK)
Apache Storm requires Java because it runs on the JVM. To check if Java is installed, run:
java -version
If Java is not installed, run these commands on Ubuntu/Debian-based systems:
sudo apt update
sudo apt install openjdk-8-jdk -y
For CentOS/Fedora:
sudo yum install java-1.8.0-openjdk-devel -y
Verify the installation again with java -version.
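Because the prerequisite is JDK 8 or higher, it can help to check the major version in a script rather than by eye. The sketch below assumes the version string has the usual quoted form printed by java -version (for example "1.8.0_292" or "11.0.19"). The helper name java_major is hypothetical.

```shell
# Hypothetical helper: print the major Java version for a version string.
# Handles the legacy "1.8.0_292" scheme as well as "11.0.19" or "17".
java_major() {
  jm_ver="$1"
  case "$jm_ver" in
    1.*) jm_ver="${jm_ver#1.}" ;;  # legacy scheme: drop the leading "1."
  esac
  printf '%s\n' "${jm_ver%%[._-]*}"  # keep everything before the first ., _ or -
}

# Only query the JVM when java is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; the version is the first quoted field.
  jm_raw=$(java -version 2>&1 | awk -F '"' '/version/ {print $2; exit}')
  if [ "$(java_major "$jm_raw")" -ge 8 ] 2>/dev/null; then
    echo "Java $jm_raw meets the JDK 8 or higher prerequisite"
  else
    echo "Java $jm_raw is too old or unrecognized; install JDK 8 or newer" >&2
  fi
fi
```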
Step 2: Install Apache Zookeeper
Zookeeper is necessary for managing the Storm cluster state.
To install Zookeeper on Ubuntu:
sudo apt install zookeeperd -y
Start and enable Zookeeper if not running:
sudo systemctl start zookeeper
sudo systemctl enable zookeeper
Verify Zookeeper is running:
sudo systemctl status zookeeper
Step 3: Download and Extract Apache Storm
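Visit the official Apache Storm site to find the latest release, then download it with wget. This tutorial uses version 2.4.0; if that version is no longer hosted on downloads.apache.org, older releases are kept at archive.apache.org/dist/storm/.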
wget https://downloads.apache.org/storm/apache-storm-2.4.0/apache-storm-2.4.0.tar.gz
Extract the tarball to /opt:
sudo tar -xvzf apache-storm-2.4.0.tar.gz -C /opt/
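Apache publishes a SHA-512 checksum file next to each release tarball, and comparing against it before extracting catches corrupted or truncated downloads. This is a minimal sketch, assuming sha512sum from GNU coreutils is available and that you copy the expected hex digest from the checksum file yourself. The helper name verify_sha512 is hypothetical.

```shell
# Hypothetical helper: succeed only if FILE hashes to EXPECTED (hex SHA-512).
verify_sha512() {
  vs_file="$1"
  # Published digests are sometimes upper-case; normalize to lower-case.
  vs_expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  vs_actual=$(sha512sum "$vs_file" 2>/dev/null | awk '{print $1}')
  # An empty digest means hashing failed (e.g. missing file), so treat it as a mismatch.
  [ -n "$vs_actual" ] && [ "$vs_actual" = "$vs_expected" ]
}
```

To use it, run verify_sha512 apache-storm-2.4.0.tar.gz followed by the digest in quotes, and extract the tarball only if the command succeeds.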
Step 4: Configure Environment Variables
For convenience, create a symbolic link and add Storm to your PATH:
sudo ln -s /opt/apache-storm-2.4.0 /opt/storm
export STORM_HOME=/opt/storm
export PATH=$PATH:$STORM_HOME/bin
Add the above environment variables to ~/.bashrc or /etc/profile to load them every session.
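Appending the exports by hand tends to leave duplicates if the step is repeated. The sketch below adds each line only when it is not already present. The helper names append_once and persist_storm_env are hypothetical, and the default target of ~/.bashrc is an assumption you can override by passing another file.

```shell
# Hypothetical helper: append LINE to FILE unless that exact line is already there.
append_once() {
  ao_line="$1"
  ao_file="$2"
  touch "$ao_file"
  # -x matches whole lines, -F treats the line as a literal string.
  grep -qxF -- "$ao_line" "$ao_file" || printf '%s\n' "$ao_line" >> "$ao_file"
}

# Hypothetical helper: persist the Storm variables in a profile file.
# Single quotes keep $PATH and $STORM_HOME unexpanded until the profile is loaded.
persist_storm_env() {
  pse_profile="${1:-$HOME/.bashrc}"
  append_once 'export STORM_HOME=/opt/storm' "$pse_profile"
  append_once 'export PATH=$PATH:$STORM_HOME/bin' "$pse_profile"
}
```

Calling persist_storm_env with no argument edits ~/.bashrc. Open a new shell or source the file afterwards to pick up the variables.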
Step 5: Configure Storm
Edit the Storm configuration file /opt/storm/conf/storm.yaml with your preferred editor.
Minimal example configuration:
storm.zookeeper.servers:
  - "localhost"
nimbus.seeds: ["localhost"]
storm.local.dir: "/var/storm"
supervisor.slots.ports:
  - 6700
  - 6701
  - 6702
  - 6703
This example assumes a single-node Storm cluster. For a multi-node cluster, update hostnames accordingly.
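For a multi-node cluster, the same file can be generated from a list of hostnames instead of edited by hand. The sketch below prints a configuration with the same keys as the minimal example. The helper name render_storm_yaml and the hostnames zk1, zk2, nimbus1 and so on are hypothetical placeholders, not values Storm requires.

```shell
# Hypothetical helper: print a storm.yaml for the given ZooKeeper and Nimbus hosts.
# Arguments are space-separated host lists, e.g. "zk1 zk2 zk3" and "nimbus1 nimbus2".
render_storm_yaml() {
  echo 'storm.zookeeper.servers:'
  for rsy_host in $1; do
    printf '  - "%s"\n' "$rsy_host"
  done
  # Build nimbus.seeds as a flow-style list: ["a", "b"]
  rsy_seeds=""
  for rsy_host in $2; do
    rsy_seeds="${rsy_seeds:+$rsy_seeds, }\"$rsy_host\""
  done
  printf 'nimbus.seeds: [%s]\n' "$rsy_seeds"
  echo 'storm.local.dir: "/var/storm"'
  echo 'supervisor.slots.ports:'
  for rsy_port in 6700 6701 6702 6703; do
    printf '  - %s\n' "$rsy_port"
  done
}
```

One way to apply it is to pipe the output into the configuration file, for example render_storm_yaml "zk1 zk2 zk3" "nimbus1" | sudo tee /opt/storm/conf/storm.yaml, and repeat on every node.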
Step 6: Start Storm Services
Storm has several daemon processes:
- nimbus: master node
- supervisor: manages workers
- ui: web UI
Start each in a separate terminal or create system service scripts for production. To start manually:
storm nimbus
storm supervisor
storm ui
The Storm UI is accessible at http://localhost:8080 by default.
Troubleshooting Tips
- If Storm cannot connect to Zookeeper, ensure Zookeeper is running and the host is reachable.
- Verify Java version compatibility.
- Check storm.yaml carefully for YAML syntax errors.
- Firewall settings might need adjustment to allow Storm ports.
- Logs located in $STORM_HOME/logs can help diagnose problems.
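When a daemon fails to start, a quick first pass is to list which log files contain errors. The sketch below assumes Storm's default log layout, in which the level appears as "[ERROR]", and that log files end in .log. The helper name storm_error_logs is hypothetical.

```shell
# Hypothetical helper: list log files under DIR that contain ERROR-level entries.
# DIR defaults to $STORM_HOME/logs, falling back to /opt/storm/logs.
storm_error_logs() {
  sel_dir="${1:-${STORM_HOME:-/opt/storm}/logs}"
  # -l prints only file names, -F matches the bracketed level literally.
  grep -lF '[ERROR]' "$sel_dir"/*.log 2>/dev/null
}
```

Running storm_error_logs with no argument prints the matching file names, which you can then open or pass to tail.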
Summary Checklist
- ✅ Installed JDK 8 or newer
- ✅ Installed and running Zookeeper
- ✅ Downloaded and extracted Apache Storm
- ✅ Configured environment variables
- ✅ Edited storm.yaml with correct settings
- ✅ Started Storm daemons and verified UI access
For more detailed insights on cluster management and real-time data processing pipelines, you can also explore our tutorial on How to Install Apache Flink, another powerful stream processing tool.
