Data Engineering and Data Governance
Just a short blog this week, but one that addresses the alignment between data governance and other data roles. Of course, I am biased, but I have always believed that data governance can help the other data management disciplines be more successful. The rising prominence of data engineer roles across organisations presents an interesting question: how do these technical specialists integrate within an established data governance framework? The answer lies in recognising data engineers as data custodians rather than creating a separate data governance role.
Understanding the Data Engineer Role
A data engineer is not a data governance role; it is a fundamentally technical role that exists whether or not you have data governance in place. Data engineers function as technical intermediaries who source, transform, and prepare data for analytical consumption. Their responsibilities span several critical dimensions:
Data Pipeline Architecture: Designing and building automated systems that move data from source systems to analytical platforms. This includes creating robust ETL (Extract, Transform, Load) processes that handle varying data volumes and formats whilst maintaining reliability.
Technical Transformation: Converting raw data into standardised, analysis-ready formats (including the increasingly popular data products). Data engineers apply business rules, handle data type conversions, and implement transformation logic that aligns technical outputs with analytical requirements.
System Integration: Connecting disparate data sources across the organisation. This involves working with APIs, databases, cloud platforms, and legacy systems to create unified data flows that support comprehensive analytics capabilities.
Quality Assurance: Implementing checks throughout data pipelines. Data engineers build monitoring systems that detect anomalies, track data lineage, and ensure data quality.
Performance Optimisation: Ensuring data processing operates efficiently at scale.
Infrastructure Management: Maintaining the technical environment that supports data operations.
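To make the pipeline, transformation and quality assurance responsibilities above a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a recommended design: the sample rows, the field names, and the `extract`, `transform` and `run_pipeline` functions are all hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class PipelineResult:
    loaded: List[Dict]    # rows that passed transformation and checks
    rejected: List[Dict]  # raw rows set aside for follow-up


def extract() -> List[Dict]:
    # Hypothetical source rows; a real pipeline would read from a
    # database, file, or API instead of a hard-coded list.
    return [
        {"customer_id": "101", "country": " gb ", "spend": "250.50"},
        {"customer_id": "102", "country": "US", "spend": "not available"},
        {"customer_id": "", "country": "de", "spend": "75"},
    ]


def transform(row: Dict) -> Dict:
    # Apply simple, assumed business rules: numeric identifiers,
    # upper-case country codes, spend as a number.
    # int() and float() raise ValueError on values they cannot convert.
    return {
        "customer_id": int(row["customer_id"]),
        "country": row["country"].strip().upper(),
        "spend": float(row["spend"]),
    }


def run_pipeline(rows: Iterable[Dict]) -> PipelineResult:
    loaded, rejected = [], []
    for row in rows:
        try:
            loaded.append(transform(row))
        except ValueError:
            # Quality check: keep the failing raw row rather than
            # silently dropping it or guessing a value.
            rejected.append(row)
    return PipelineResult(loaded, rejected)


result = run_pipeline(extract())
print(len(result.loaded), "loaded;", len(result.rejected), "rejected")
```

In this sketch the rejected rows are kept rather than discarded, so that someone in a custodian role could raise them with the relevant data owner instead of deciding for themselves what the data should say.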
So you can see that a data engineer is a technical role, distinct from formal data governance roles. However, within the data governance structure, data engineers align naturally with the data custodian role through their operational responsibilities. Whilst most data governance roles are predominantly fulfilled by business stakeholders who understand data context and requirements, the data custodian role is often fulfilled by IT professionals.
The data custodian role is a good fit because data engineers are not the business stakeholders accountable for the data itself. Data engineers maintain and transform data according to business requirements without assuming ownership responsibilities. They are responsible for liaising with data owners to obtain permission to use their data, ensuring appropriate authorisation throughout the data lifecycle.
Successful integration means positioning data engineers within your existing data governance framework, rather than treating them as a separate entity, and making sure that their activities are aligned with it. Agreeing that they are data custodians provides flexibility to accommodate the technical nature of data engineering work whilst ensuring alignment with your data governance objectives.
Effective data governance succeeds through role clarity and simplicity. Data engineers contribute most effectively when their technical capabilities operate within your established data governance framework, and considering them as data custodians is a simple and effective way to achieve this.
If you want a deeper insight into the data custodian role, I've previously explored this topic in this blog post.
If you'd like support aligning data engineer roles with your data governance framework, book a call with me using the button below.