Data engineers: Key role in data privacy and protection


As a career data geek, I enjoy watching how the growing pervasiveness and popularity of data are reshaping industries and mainstream culture. A related trend is the increasing number of jobs that include data in their title. One that's becoming almost as prevalent as data scientist these days is data engineer. Searching a few of the major job posting websites, I found the expected variability in how the role of data engineer is defined. But the most common job responsibilities and skills included:

  • Identify and evaluate data sources, both internal and external to the enterprise.
  • Design and model data infrastructure solutions for both relational and NoSQL data structures.
  • Build and maintain extract-transform-load (ETL), data integration and data quality processes.
  • Document requirements, data lineage and subject matter in both business and technical terminology.
  • Have strong programming skills in various languages (Python and Java were most commonly cited).
  • Be proficient with big data technologies, such as Hadoop, MapReduce, Hive, Pig and Apache Spark.

It seems to me that data engineer has become a catch-all term for data-related responsibilities not assigned to data scientists and data analysts. And it encapsulates aspects of job titles I have heard for decades, like data modeler, database administrator (DBA), data steward and ETL developer.

Data engineers: Implementing data protection

Another recurring aspect of many job postings relates to data privacy and the role data engineers play in protecting sensitive data. This is especially relevant in light of regulatory compliance frameworks such as the European Union’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). To address these and other data privacy initiatives, organizations need clearly defined data privacy policies that detail how to identify sensitive personal data that's subject to protection.

Organizations can reduce the risk of unauthorized access to sensitive data by:

Most of the responsibility for implementing data protection falls under the purview of what are now called data engineers.
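
As a rough illustration of what that implementation work can look like, here is a minimal Python sketch of two common steps: identifying fields that contain personal data and pseudonymizing them before the data moves downstream. Everything in it is an assumption made for illustration. The function names, the sample record and the single email pattern are hypothetical, and a real data privacy policy would cover many more kinds of personal data than email addresses.

```python
import hashlib
import hmac
import re

# Hypothetical pattern covering one kind of personal data (email addresses).
# A real policy would define many more categories and detection rules.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_sensitive_fields(record):
    """Return the names of fields whose string values contain an email address."""
    return [
        name
        for name, value in record.items()
        if isinstance(value, str) and EMAIL_PATTERN.search(value)
    ]


def pseudonymize(value, secret_key):
    """Replace a value with a keyed SHA-256 digest (HMAC).

    The same input and key always produce the same token, so records can
    still be joined on the token without exposing the original value.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def protect_record(record, secret_key):
    """Return a copy of the record with detected sensitive fields pseudonymized."""
    sensitive = set(find_sensitive_fields(record))
    return {
        name: pseudonymize(value, secret_key) if name in sensitive else value
        for name, value in record.items()
    }


# Sample data and key are made up; a real key would come from a secrets store.
record = {"customer_id": 42, "contact": "jane.doe@example.com", "country": "DE"}
protected = protect_record(record, secret_key=b"demo-key-not-for-production")
```

The keyed hash is only one possible design choice. Depending on the policy, masking, tokenization or encryption may be a better fit, and pseudonymized data generally still counts as personal data under GDPR.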


About Author

Jim Harris

Blogger-in-Chief at Obsessive-Compulsive Data Quality (OCDQ)

Jim Harris is a recognized data quality thought leader with 25 years of enterprise data management industry experience. Jim is an independent consultant, speaker, and freelance writer. Jim is the Blogger-in-Chief at Obsessive-Compulsive Data Quality, an independent blog offering a vendor-neutral perspective on data quality and its related disciplines, including data governance, master data management, and business intelligence.
