Explore, transform, validate and integrate your data

About this book
* Manipulate your data by exploring, transforming, validating and integrating it using Pentaho Data Integration 7
* Leverage the amazing innovations of Big Data using Apache Spark with Pentaho Data Integration 7
* A comprehensive guide to exploring the features of Pentaho Data Integration 7

Who this book is for
This book is a must-have for software developers, business intelligence analysts, IT students and all those involved or interested in the development of ETL solutions or, more generally, in any kind of data manipulation.
Those who have never used Pentaho Data Integration (PDI) will benefit most from the book, but those who have used it will also find it useful.
This book is also a good starting point for data warehouse designers, architects or anyone who is responsible for data warehouse projects and needs to load data into them.

What you'll learn
* Install and get started with Pentaho Data Integration
* Learn the ins and outs of Spoon, the graphical design tool
* Transform data in a variety of ways, such as performing simple and complex calculations, cleaning, counting, deduplication, filtering, and sorting
* Learn how to obtain data from all types of data sources such as flat files, Excel spreadsheets, databases, and XML files
* Use Pentaho Data Integration to perform CRUD (create, read, update, and delete) operations
* Use Pentaho Data Integration to organize files and folders, run daily processes, handle errors, and more

In detail
This book shows and explains the new interactive features of Spoon, the new look and feel, and the new features of the tool, including Transformation and Job Executors and the invaluable metadata injection capability.
We start with the installation of the Pentaho Data Integration (PDI) software and then move on to covering all the key PDI concepts.
Each chapter introduces new features, allowing you to gradually get involved with the tool.
First, you will learn how to perform all kinds of data manipulation and how to work with flat files.
The book then gives you a database primer and teaches you how to work with databases within Pentaho Data Integration.
In addition, you will learn about data warehouse concepts and how to load data into a data warehouse.
Finally, you will have the opportunity to apply and reinforce all the concepts learned through the implementation of a simple datamart.
During the course of this book, you will become familiar with the intuitive, graphical, drag-and-drop design environment.
By the end of this book, you will have learned everything you need to know to meet your data manipulation requirements.