What is PENTAHO?
Pentaho is an open-source tool, founded in 2001, that provides a GUI for building and operating your data ETL. It comes in Community and commercial editions, and its engine is built with Java, which you can also use to develop for it. It is a relatively complete tool for ETL, data warehouse organization, and building BI analysis reports. The Community Edition currently has 13,500 registered users.
In this article, I will show you how to install Pentaho. So what is Pentaho Data Integration, aka “Super Kettle”? Pentaho Data Integration (PDI) is an ETL (Extract, Transform, Load) tool for managing data ingestion pipelines. As we generate more and more data across different sources and formats, it becomes difficult to manage the data pipelines that support better decisions.
PDI is a useful tool for managing such pipelines seamlessly. I will be writing a series of blog posts explaining the end-to-end process of creating configurable data ingestion pipelines that handle multiple data structures and formats. We’ll start with the pre-installation steps and finish with deployment.
Pentaho comes in two editions, Enterprise and Community. In this article we will install the Community edition.
System requirements
Processor: Intel EM64T or AMD64 Dual-Core
RAM: 8 GB, with 2 GB dedicated to PDI (it can also run on systems with 4 GB of RAM)
Disk space: 20 GB free after installation
Screen size: 1280 x 960, for comfortable viewing of the PDI UI
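
If you would like to check a machine against the list above before installing, a small script can help. The following is only a rough sketch: it covers the CPU architecture, core count, and free disk space, and skips RAM and screen size because the Python standard library has no portable way to read them. The function name `check_requirements` and the choice to inspect the root of the current drive are assumptions made for illustration. Note also that `os.cpu_count()` reports logical cores, and that the disk figure is measured before installation rather than after.

```python
import os
import platform
import shutil

# Thresholds taken from the requirements list above.
MIN_FREE_DISK_GB = 20
MIN_CPU_CORES = 2
# Names Python commonly reports for 64-bit Intel/AMD processors.
SUPPORTED_ARCHS = {"amd64", "x86_64"}


def check_requirements(machine, cpu_cores, free_disk_bytes):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if machine.lower() not in SUPPORTED_ARCHS:
        problems.append(f"unexpected CPU architecture: {machine}")
    if cpu_cores < MIN_CPU_CORES:
        problems.append(
            f"only {cpu_cores} CPU core(s), expected at least {MIN_CPU_CORES}"
        )
    free_gb = free_disk_bytes / 1024**3
    if free_gb < MIN_FREE_DISK_GB:
        problems.append(
            f"only {free_gb:.1f} GB free disk, expected at least {MIN_FREE_DISK_GB}"
        )
    return problems


if __name__ == "__main__":
    # Assumption: PDI will live on the current drive, so check its root.
    install_drive = os.path.abspath(os.sep)
    issues = check_requirements(
        platform.machine(),
        os.cpu_count() or 1,
        shutil.disk_usage(install_drive).free,
    )
    print("OK" if not issues else "\n".join(issues))
```

If you plan to install on another drive, point `install_drive` at that drive instead.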
[Video: detailed Pentaho installation instructions](https://www.youtube.com/watch?v=u7COUgoLo6I)
Step 1: Download PDI-CE from SourceForge.
At the time of writing, the latest version of PDI is 9.3. Download it, or whichever stable version fits your requirements. The filename is “pdi-ce-9.3.0.0-428.zip”.
Step 2: Download and Install Java
Download Java SE Development Kit 8 from the official website, because PDI is built with Java as its back-end programming language. Download the version shown in the image below. Oracle will prompt you to register with some basic information.
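
Once Java is installed, it can be worth confirming which version is actually on your PATH. Below is a minimal sketch that runs `java -version` and extracts the major version from its banner. The helper names `parse_java_major` and `installed_java_major` are hypothetical, and the parsing assumes the usual `version "..."` banner format, so treat it as an illustration rather than a robust detector.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_...").
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major():
    """Return the major version of the `java` on PATH, or None if not found."""
    if shutil.which("java") is None:
        return None
    # `java -version` writes its banner to stderr, not stdout.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Java was not found on PATH")
    else:
        print(f"Found Java {major}")
```

For the JDK 8 download described in this step, the script should report Java 8.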
Step 3: Extract the pdi-ce-9.3.0.0-428.zip file into a setup directory.
You should store it on a drive other than C, because the file is larger than 1 GB. It’s best to create an “Applications” folder on the “D” drive and keep all third-party apps in that folder, and that is the approach we will use here. There is no executable file (.exe) to run to install PDI; you just extract the .zip file. It’s that easy!
Step 4: The installation is complete.
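
If you prefer to script the extraction rather than use a file manager, it could be sketched as follows. The function name `extract_pdi` and the download location are assumptions for illustration; the target folder follows the “Applications” on “D” suggestion above. Adjust both paths to your own machine.

```python
import zipfile
from pathlib import Path


def extract_pdi(archive_path, target_dir):
    """Unpack the archive into target_dir and return its top-level entry names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        # Names inside a zip always use "/" as the separator.
        top_level = sorted({name.split("/")[0] for name in archive.namelist()})
    return top_level


if __name__ == "__main__":
    # Both paths are assumptions; change them to match your setup.
    archive = Path.home() / "Downloads" / "pdi-ce-9.3.0.0-428.zip"
    if archive.exists():
        print(extract_pdi(archive, r"D:\Applications"))
    else:
        print(f"Archive not found: {archive}")
```

The returned list shows which folders were created at the top level of the target directory, which is a quick way to confirm that the extraction worked.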