
What to Know
This guide is for beginners and those who are ready to muddle through the jargon. Recommended reading links are provided to help you learn more.

Analytical Methods
Operational Reporting
Transaction systems provide the reporting required for operational, regulatory, and compliance purposes. These reports are engineered so that they do not degrade the performance of the transaction system.
​
Cross-System Analytics
Analysis that draws on data from more than one system. It can take the form of longitudinal analysis across systems or data science on one or more datasets.
Cross-Era Analytics
Analysis of a single data domain across multiple ERPs, for example financial reporting that spans a legacy system and Workday.
​
Statistical Methods
Traditional statistical techniques, such as linear regression, used to analyze relationships between variables in a dataset.
AI/ML
Artificial intelligence and machine learning algorithms used in data science.
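
To make the Statistical Methods entry above concrete, here is a minimal sketch of a simple linear regression with a correlation coefficient, written with the Python standard library only. The function name, variable names, and data are made up for illustration and do not come from any particular system.

```python
from math import sqrt


def linear_regression(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined slope or correlation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical data: hours of training versus an assessment score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]

slope, intercept, r = linear_regression(hours, scores)
print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.3f}")
```

A value of r close to 1 or -1 indicates a strong linear relationship. It does not by itself show that one variable causes the other.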

Data Architecture
Data Warehouse
A database that organizes data from multiple connected data domains for reporting.
​
Data Mart
A data warehouse that covers a single data domain.
​
Data Vault
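A method of organizing data for analysis into hubs (business keys), links (relationships between hubs), and satellites (descriptive attributes with their history), which makes the data easier to maintain as source systems change.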
​
Data Lake
A repository of data in different formats - structured, unstructured, files, images, etc.
​
Unstructured Repositories
Storage for large-volume (big data) and unstructured data.
​
Data Science
Analytical techniques that support predictive analysis based on statistical and AI/ML algorithms.
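
As a rough illustration of the Data Vault entry above, the sketch below models the three table types commonly used in Data Vault designs: a hub, a link, and a satellite. All class, field, and function names here are hypothetical, and hashing the business key with MD5 is one common convention, assumed here rather than required.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


def hash_key(*business_keys: str) -> str:
    """Derive a surrogate key from one or more business keys (assumed convention)."""
    joined = "|".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class HubCustomer:
    """Hub: one row per business key, nothing descriptive."""
    customer_hk: str
    customer_id: str
    load_date: date
    record_source: str


@dataclass(frozen=True)
class LinkCustomerOrder:
    """Link: a relationship between two hubs."""
    link_hk: str
    customer_hk: str
    order_hk: str
    load_date: date


@dataclass(frozen=True)
class SatCustomerDetails:
    """Satellite: descriptive attributes, one row per change over time."""
    customer_hk: str
    load_date: date
    name: str
    city: str


def current_details(rows, customer_hk):
    """Return the most recently loaded satellite row for a hub key, or None."""
    matching = [r for r in rows if r.customer_hk == customer_hk]
    return max(matching, key=lambda r: r.load_date) if matching else None


# Hypothetical sample data.
hk = hash_key("C-001")
hub = HubCustomer(hk, "C-001", date(2023, 1, 1), "legacy_erp")
link = LinkCustomerOrder(hash_key("C-001", "O-900"), hk, hash_key("O-900"), date(2023, 3, 1))
history = [
    SatCustomerDetails(hk, date(2023, 1, 1), "Ada Lopez", "Dallas"),
    SatCustomerDetails(hk, date(2024, 6, 1), "Ada Lopez", "Austin"),
]

print(current_details(history, hk).city)
```

Because a change in a source system adds new satellite rows instead of updating existing ones, history is preserved and the hubs and links stay stable.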

Analysis Platforms
Relational Databases
Oracle, SQL Server, and MySQL are popular relational platforms. They are suitable for data warehouses, data marts, and data vaults, and for operational reporting, cross-system analytics, and cross-era analytics.
​
NoSQL Databases
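Databases that depart from the row-oriented relational model, for example by storing data in columnar format instead of row format. Amazon Redshift is a widely used columnar platform; it is based on PostgreSQL (Postgres) and is queried with SQL.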
​
Database as a Service
Cloud-hosted databases geared specifically to analytics solutions, such as Snowflake.
​
Big Data Platforms
Amazon S3 and Hadoop are widely used for unstructured data and data lakes.
​
Files
Data science techniques often use files to store, connect, and manage data.
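
The difference between row format and columnar format, mentioned above, can be sketched in a few lines of Python. This is a simplified in-memory illustration with made-up data and a hypothetical helper name, not a description of how any particular database stores data on disk.

```python
# Row format: each record is kept together.
rows = [
    {"order_id": 1, "region": "East", "amount": 120.0},
    {"order_id": 2, "region": "West", "amount": 80.5},
    {"order_id": 3, "region": "East", "amount": 42.25},
]


def to_columns(records):
    """Pivot a list of records into one list per column.

    Assumes every record has the same keys.
    """
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


# Columnar format: all values of one column are kept together.
columns = to_columns(rows)

# A row layout must visit every record to total one field...
total_from_rows = sum(row["amount"] for row in rows)
# ...while a columnar layout reads only the column it needs.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Analytical queries usually aggregate a few columns over many rows, which is why columnar layouts tend to suit analytics. Transaction systems usually read and write whole records, which favors row layouts.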

Data Integration
Industrial-Strength Tier
Informatica and Talend scale both on-premises and in the cloud. They offer the most extensive features, are the most costly, and can handle structured and unstructured data.
​
Wide Usage Tier
The most widely used, lower-cost tools include Microsoft SQL Server Integration Services (SSIS) and IBM's DataStage.
​
Free Editions