Big Data and Advanced Analytics using WhereScape RED
We at WhereScape have recently invested heavily in integrating our automation software, WhereScape RED, with big data platforms. Customers have shown strong interest, recognising the value that automation can bring to these powerful yet complex solutions.
We recently implemented a financial forecasting solution at an enterprise customer to prove the value of big data technology within the organisation. Developed in partnership between WhereScape and the customer, the solution:
- Delivered a high-value forecasting model to the business
- Used big data technology to deliver a model where traditional relational technology failed
- Proved the value of big data technology within the organisation
- Highlighted the potential of big data technologies to solve new and interesting problems
Problem
The customer needed to generate an accurate set of key performance indicators each month. The process required large daily volumes (approx. 20 million rows per day) of customer and detailed product revenue data. Because of storage and processing limitations, data was only available for the current month, which reduced forecast accuracy.
Solution
To solve the storage and processing problem, the customer decided to implement a Cloudera Hadoop big data platform to store full historical datasets, along with the WhereScape RED for Big Data Adaptor to enable data lake automation. The completed solution delivered the following functionality:
- An automated process to extract large volumes of transactions from source each day and save them to the Cloudera platform using Hive
- Based on the extracted data, a forecast model was built in Hive and SQL Server. This model was built using WhereScape RED via a rapid, iterative development cycle
- Once the forecast numbers were prepared, an automated process saved the forecast as an incremental snapshot in Hive and refreshed Cloudera Impala for interactive querying via Tableau
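The three steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the code WhereScape RED generates: the connection string, account, password file, and all table and column names (REVENUE_TXN, staging.revenue_txn, forecast.kpi_snapshot and so on) are hypothetical, and the functions only build the Sqoop command line and the Hive and Impala statements rather than run them.

```python
from datetime import date


def sqoop_import_args(load_date: date) -> list[str]:
    """Build a Sqoop 1 command line that lands one day of source rows in Hive."""
    day = load_date.isoformat()
    return [
        "sqoop", "import",
        # Hypothetical Oracle source, account, and HDFS password file.
        "--connect", "jdbc:oracle:thin:@//source-host:1521/SALES",
        "--username", "etl_user",
        "--password-file", "/user/etl/.source_password",
        # Hypothetical source table, filtered to a single day's transactions.
        "--table", "REVENUE_TXN",
        "--where", f"TXN_DATE = DATE '{day}'",
        "--split-by", "TXN_ID",
        "--num-mappers", "8",
        # Load straight into a Hive table, one partition per load date.
        "--hive-import",
        "--hive-table", "staging.revenue_txn",
        "--hive-partition-key", "load_date",
        "--hive-partition-value", day,
    ]


def snapshot_statements(snapshot_date: date) -> dict[str, str]:
    """Return the Hive insert and the Impala refresh for one forecast snapshot."""
    day = snapshot_date.isoformat()
    # Each run appends a new partition, so earlier snapshots are kept.
    hive_sql = (
        "INSERT INTO TABLE forecast.kpi_snapshot "
        f"PARTITION (snapshot_date = '{day}') "
        "SELECT kpi_name, forecast_value FROM forecast.kpi_current"
    )
    # REFRESH reloads metadata for a table Impala already knows about;
    # a table Impala has never seen would need INVALIDATE METADATA instead.
    impala_sql = "REFRESH forecast.kpi_snapshot"
    return {"hive": hive_sql, "impala": impala_sql}
```

In practice a scheduler would pass the argument list to a process runner and submit the two statements through Hive and Impala clients. Building them as plain values first makes each daily run easy to log and audit.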
Some of the key features and benefits of using WhereScape RED for Big Data are:
- Ability to transfer large amounts of data from source systems into Hive seamlessly
- Common metadata across the Extended Data Warehouse environment (Hive, SQL, etc.)
- Consistent tools for developers
- Easy generation of DDL and ELT SQL for Hive, with data movement handled by Sqoop
- Centralised audit and error logging
- Integrated documentation across the full environment (i.e. SQL Server and Hive)
- Automated and integrated scheduling and workflow engine across the full environment
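To give a feel for what metadata-driven DDL generation means, the sketch below renders a Hive CREATE TABLE statement from a small column description. It is an illustration only and does not reflect how WhereScape RED stores metadata or generates code: the Column type, the hive_create_table function, the example table, and the choice of Parquet storage are all assumptions made for this example.

```python
from typing import NamedTuple


class Column(NamedTuple):
    """Hypothetical metadata record for one table column."""
    name: str
    hive_type: str
    comment: str = ""


def hive_create_table(table: str, columns: list[Column],
                      partition_by: list[Column]) -> str:
    """Render Hive DDL for a table from column metadata."""
    def render(col: Column) -> str:
        text = f"  `{col.name}` {col.hive_type}"
        if col.comment:
            # Escape single quotes so the comment stays a valid string literal.
            escaped = col.comment.replace("'", "\\'")
            text += f" COMMENT '{escaped}'"
        return text

    body = ",\n".join(render(col) for col in columns)
    ddl = f"CREATE TABLE IF NOT EXISTS {table} (\n{body}\n)"
    if partition_by:
        parts = ", ".join(f"`{col.name}` {col.hive_type}" for col in partition_by)
        ddl += f"\nPARTITIONED BY ({parts})"
    # Storage format is an assumption for this sketch.
    return ddl + "\nSTORED AS PARQUET;"


if __name__ == "__main__":
    columns = [
        Column("customer_id", "BIGINT", "Source customer key"),
        Column("revenue", "DECIMAL(18,2)"),
    ]
    print(hive_create_table("staging.revenue_txn", columns,
                            [Column("load_date", "STRING")]))
```

Keeping the column list as data is what lets one description drive the DDL, the load SQL, and the documentation from a single place.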
Outcome
The project sponsors (both Business and IT) were impressed with how easily and quickly WhereScape RED delivered a solution to their big data problem. Now that several months of data are readily accessible, business stakeholders are excited that they can generate accurate forecasts in less than 60 minutes rather than several days. Financial analysts can also query the big data platform directly via Tableau to rapidly gain insight, without waiting for data to be transferred to a relational data warehouse and enterprise reporting suite.
IT have learned that the WhereScape Big Data Adaptor and Cloudera Big Data platform can be used to solve complex and valuable business problems. The project also proved that big data technologies from WhereScape and Cloudera are functional and robust, which means that projects previously deemed impossible can be looked at again.
Technologies Used
The following technologies were used to build this solution:
- Oracle
- SQL Server
- WhereScape RED
- WhereScape RED Big Data Adaptor
- Cloudera Big Data Platform
- Hive
- Sqoop
- Impala
About WhereScape RED for Big Data
WhereScape RED is data warehouse and big data automation software for building, deploying and renovating your analytic solutions whatever the size of your data. WhereScape RED sets the standard for delivery speed using familiar industry standards, frameworks and best practices to dramatically accelerate time to value.
WhereScape RED customers are able to fully manage their Apache Hive™ big data environments through the WhereScape RED data automation platform. This centralises development of the entire decision support infrastructure into one integrated platform and toolset.
There is no need to license separate ETL, data integration, or data modelling tools because WhereScape RED supports industry standard SQL. Customers can leverage their existing resources and training rather than having to rely on tool or platform-specific expertise.