The U.S. Army and DoD Chief Digital and Artificial Intelligence Office (CDAO) Accelerate the Speed and Efficiency of Data with Two Data Replication Pipelines 

By Denise Kovalevich, OEM
September 20, 2023

(Photo Credit: U.S. Army)

The U.S. Army and the Department of Defense's Chief Digital and Artificial Intelligence Office (CDAO) have unveiled significant advances in information infrastructure that will increase the frequency of data updates while ensuring accuracy and improving accountability. The two organizations revealed details of two real-time streaming data replication pipelines from two of the Army’s major Enterprise Resource Planning systems (ERPs), the Army General Fund Enterprise Business System (GFEBS) and Global Combat Support System-Army (GCSS-Army), into the DoD’s centralized data and analytics platform, known as ADVANA.

The Army Business Mission Area (BMA) Financial and Logistic Data Stewards worked tirelessly with the CDAO team on this initiative. According to the Finance Data Steward, Chase Levinson, “These newly implemented data pipelines will enable Army to take in large volumes of raw data from our legacy ERPs and move that data to a consolidated platform for more timely analysis. We will increase the ingestion of our key data sets from once monthly to real-time.”

Data replication pipelines are essential for efficiently moving information in real time from one source system to one or more target systems. These pipelines also play a crucial role in data integration, analytics, and intelligence workflows, which will have significant benefits for the Office of the Secretary of Defense (OSD).
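
To make the idea of replicating from one source to several targets concrete, the following is a minimal illustrative sketch, not the Army or CDAO implementation. It assumes the source system emits a stream of change events (insert, update, delete) and that each target simply applies those events in order. The names `ChangeEvent`, `InMemoryTarget`, and `replicate`, and the sample document keys, are hypothetical and do not come from the article.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One change captured at the source (hypothetical event shape)."""
    op: str                              # "insert", "update", or "delete"
    key: str                             # primary key of the affected row
    row: Optional[Dict[str, Any]] = None # new row contents; None for deletes


class InMemoryTarget:
    """Stand-in for a target system; keeps replicated rows in a dict."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: Dict[str, Dict[str, Any]] = {}

    def apply(self, event: ChangeEvent) -> None:
        if event.op in ("insert", "update"):
            self.rows[event.key] = dict(event.row or {})
        elif event.op == "delete":
            self.rows.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")


def replicate(events: Iterable[ChangeEvent], targets: List[InMemoryTarget]) -> int:
    """Apply every source event, in order, to every target; return the event count."""
    applied = 0
    for event in events:
        for target in targets:
            target.apply(event)
        applied += 1
    return applied


# Example: one source stream fanned out to two targets.
events = [
    ChangeEvent("insert", "DOC-1", {"amount": 100}),
    ChangeEvent("update", "DOC-1", {"amount": 125}),
    ChangeEvent("insert", "DOC-2", {"amount": 40}),
    ChangeEvent("delete", "DOC-2"),
]
analytics = InMemoryTarget("analytics")
archive = InMemoryTarget("archive")
count = replicate(events, [analytics, archive])
```

Because both targets receive the same ordered events, they end in the same state as the source. A production pipeline would add durable delivery, ordering guarantees, and error handling, none of which are shown here.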

“With the success of the Army BMA and OSD CDAO team partnering to establish real-time streaming data replication pipelines in milliseconds from Army ERP data sources into ADVANA, it empowers decision makers with the ability to harness the full potential of their data, gain a competitive analytic edge, and drive innovation in the areas of advanced analytics and machine learning,” said Mr. Bakari P. Dale, Army Business Mission Area Data Officer (MADO).

Advantages of the Data Replication Pipelines

  • Timely and Accurate Insights: The real-time streaming replication allows OSD to capture and process Army BMA data as it is generated, providing near-instantaneous access to the latest information. This enables faster decision-making and the ability to respond quickly to changing conditions or emerging opportunities.   
  • Improved Data Quality: By replicating Army BMA data in real time, OSD CDAO can ensure that ADVANA is always up to date with the most recent data for all of its Army and DoD users. This reduces the risk of using outdated or incomplete information, leading to more accurate and reliable insights. This is in line with the OSD Data Strategy and Army Data Plan, which require the Army to prioritize, mature, and scale ongoing data management efforts.
  • Enhanced Operational Efficiency: Real-time replication eliminates the need for manual extract, transform, and load (ETL) processes, the most time-consuming and resource-intensive portion of data ingestion. By automating the data replication pipeline, the ADVANA team can streamline data integration workflows.
  • Smoother Data Integration: Real-time replication pipelines allow data to be integrated from various sources, such as databases, applications, sensors, or external application programming interfaces (APIs). This enables OSD CDAO to consolidate data from different systems into ADVANA for better alignment.
  • Scalability and Flexibility: Real-time replication pipelines can handle large volumes of data at scale to accommodate increasing data loads more effectively than manual loads. They are designed to be flexible and adaptable, allowing the CDAO team to add new data sources or modify existing ones as their needs evolve. This will allow more organizations to share the same data.   
  • Real-Time Monitoring and Alerting: Moving forward, the Army will use ARES/ADVANA to monitor data quality and will work with OSD to identify issues or anomalies as they occur. This enables the Army and OSD to troubleshoot proactively together and to set up real-time alerts for data inconsistencies.
  • Advanced Analytics and Machine Learning: Real-time BMA data replication enables OSD CDAO to leverage advanced analytics techniques, such as real-time dashboards, predictive modeling, and machine learning algorithms. With access to up-to-date data, OSD can derive insights from its massive suite of dashboards to help make data-driven decisions in real time that directly impact Soldiers in the field.
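
As a rough illustration of the monitoring and alerting idea in the list above, the sketch below checks each replicated record for missing fields and for replication lag beyond a threshold. It is a hypothetical example, not the ARES/ADVANA tooling: the function `check_record`, the field names `posted_at` and `replicated_at`, and the one-second limit are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def check_record(record: Dict[str, Any],
                 required_fields: List[str],
                 max_lag: timedelta) -> List[str]:
    """Return a list of alert messages for one replicated record (empty if clean)."""
    alerts = []
    # Data-quality check: every required field must be present and non-empty.
    for field in required_fields:
        if record.get(field) in (None, ""):
            alerts.append(f"missing field: {field}")
    # Timeliness check: time between posting at the source and arrival at the target.
    lag = record["replicated_at"] - record["posted_at"]
    if lag > max_lag:
        alerts.append(f"replication lag {lag.total_seconds():.3f}s exceeds limit")
    return alerts


# Example: one healthy record and one that is both incomplete and late.
posted = datetime(2023, 9, 20, 12, 0, 0, tzinfo=timezone.utc)
ok_record = {
    "doc_id": "DOC-1",
    "amount": 100,
    "posted_at": posted,
    "replicated_at": posted + timedelta(milliseconds=50),
}
late_record = {
    "doc_id": "DOC-2",
    "amount": None,
    "posted_at": posted,
    "replicated_at": posted + timedelta(seconds=5),
}
limit = timedelta(seconds=1)
ok_alerts = check_record(ok_record, ["doc_id", "amount"], limit)
late_alerts = check_record(late_record, ["doc_id", "amount"], limit)
```

In practice, checks like these would run continuously against the stream and route alerts to the responsible data stewards; the thresholds and required fields would be set by the data owners.
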

“We are seeing transactions for financials and log transactions in milliseconds from when they post in the Army source system,” said Mr. Nicholas Lanham, Division Chief for Enterprise Platforms and Capabilities in the CDAO Directorate for Business Analytics (BA). “We are also now able to set a standard for other services to iterate towards for real-time inventory/log/transaction reporting. Additionally, this has changed the game for our Army teammates. Over the past several months, we engineered a sophisticated data transfer architecture that maximizes the use of the existing Army SAP/SLT services and created real-time feeds into ADVANA/MySQL/S3, and we will soon be federating this data to a new data layer in ADVANA for SAP HANA. This will touch dozens of existing systems and update all their data through the pipelines as well.”

What is Possible? 

This work took only three to four months of partnered effort between the ARES/ADVANA and Army engineering teams. Imagine what the future could hold for other Army weapon system platforms, cross-domain feeds, or the federation of approved Army data through this new data layer in ARES/ADVANA.

According to both Mr. Dale and Mr. Lanham, Army users can use the rest of the ADVANA tech stack to build, explore, and innovate with new models and capabilities. They could even build upon their knowledge of Databricks, MLflow, DataRobot, C3 AI, Qlik, Tableau, SageMaker, Perceptor, Collibra, Apigee, GitLab, and more.

“We are excited for the future of Army data integration possibilities now that even more Army data is real-time replicating in ARES,” said Lanham.