Last updated: 2026-04-06
Guide

Data Connectors & Privacy Activity Logging

Automated privacy-preserving data processing activity logging for Google Drive and other cloud storage platforms with built-in compliance monitoring and data subject rights management.

Integration Documentation

Setup: Integration Overview • API Authentication • Webhook Configuration

Available Connectors: Shopify • Stripe • Mailchimp


Key Capabilities

  • Privacy-preserving activity logging across cloud storage platforms
  • Compliance audit trails with GDPR Article 30 processing activity records
  • Real-time monitoring with privacy risk assessment and alert notifications
  • Data subject rights fulfillment with automated activity discovery and logging

Privacy-First Data Discovery Ecosystem

Enterprise-grade data processor platform providing comprehensive privacy compliance logging and audit trail management:

  • Google Drive Privacy Logging


    Privacy-preserving Google Drive activity logging with automated compliance audit trails for data processing transparency

    Key Capabilities:

      • Automated file discovery activity logging (no file content stored)
      • Privacy-preserving metadata collection for compliance auditing
      • Change detection and monitoring for processing activity transparency
      • Integration with Dxtra compliance system for audit trails
      • Support for GDPR Article 30 processing activity requirements

    Google Drive Integration

  • Privacy-Preserving PII Detection


    Privacy-first personal data detection using Microsoft Presidio analyzer with metadata-only logging for compliance audit trails

    Detection Capabilities:

      • High accuracy across multiple personal data categories
      • Privacy-preserving activity logging (no actual PII content stored)
      • Contextual analysis with confidence scoring for audit purposes
      • Extensible pattern recognition for custom data types
      • Built-in support for GDPR and privacy regulation audit requirements

    PII Detection Engine

  • E-commerce Platform Integration


    Deep e-commerce integration with specialized connectors for Shopify, WooCommerce, Magento, and other leading platforms

    E-commerce Capabilities:

      • Complete customer journey and behavioral data processing activity logging
      • Automated order history and payment data privacy compliance audit trails
      • Product review and rating data management with GDPR processing activity records
      • Marketing automation and email campaign privacy activity integration
      • Multi-store and multi-currency privacy compliance processing activity management

    E-commerce Integration

  • Custom Integration Framework


    Flexible privacy activity logging platform enabling compliance audit trail integration with any data source

    Framework Benefits:

      • Pre-built processing activity templates for rapid compliance integration
      • SDK support for Python, Node.js, Java, and .NET
      • Real-time activity logging and batch processing capabilities
      • Built-in privacy compliance validation and audit trail generation
      • Enterprise-grade security and access control for processing activities

    Custom Framework

Privacy Activity Architecture

flowchart TD
    A[Data Processors] --> B[Dxtra Activity Loggers]
    B --> C[Privacy Activity Detection]
    C --> D[Processing Activity Logging]
    D --> E[Compliance Audit Trails]
    D --> F[Data Subject Rights Support]

    subgraph "Data Processors"
        G[Google Drive]
        H[File Systems] 
        I[Shopify Store]
        J[Third-party APIs]
    end

    subgraph "Privacy Logging"
        K[Activity Detection]
        L[Compliance Assessment]
        M[Article 30 Records]
    end

Privacy-Preserving PII Detection

Dxtra's privacy-first PII detection system creates processing activity logs across various file types without storing actual content:

Privacy-Preserving Approach

  • No content storage: Only processing activity metadata is logged
  • No PII content: Actual personal data is never stored or transmitted
  • Audit trails only: Creates compliance records for data processing activities
  • GDPR Article 30: Automatic processing activity record generation

Privacy-First Detection Engine

Compliance-focused processing activity logging with privacy-by-design principles:

Privacy-preserving document processing activity logging without storing actual document content or personal data.

Processing Activity Logging:

Python
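# Privacy-preserving activity logging (no content stored)
from __future__ import annotations  # Lets the type names below stay unresolved in this excerpt

from typing import Optional
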
class PrivacyActivityClient:
    async def log_pii_scan_activity(
        self,
        data_controller_id: str,
        file_identifier: str,  # Hashed identifier, not filename
        pii_metadata: PiiDetectionMetadata  # Aggregate metadata only
    ) -> Optional[DataProcessingActivity]:
        """
        Log PII scanning activity in privacy-preserving way.

        Creates data_processing_activities record with:
        - Data controller ID (tenant)
        - Source ID (Google Drive = 2)
        - Type ID (PII Scan = 2)
        - Field IDs (mapped from entity types found)
        - No actual PII content or file content
        """

What Gets Logged (Privacy-Preserving):

JSON
{
  "data_processing_activity": {
    "data_subject_id": "uuid-hash-of-file-identifier",
    "source_id": 2,
    "type_id": 2,
    "field_ids": [1, 2, 5],
    "triggered_at": "2024-01-15T10:30:00Z",
    "metadata": "Only aggregate counts and types, no actual content"
  }
}

The field_ids values are mapped from the entity types found during the scan.
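
The `data_subject_id` value above is described as a UUID derived from a hash of the file identifier. One way such an identifier could be produced is a name-based UUID, which is deterministic for the same input. This is a minimal sketch, not a description of how Dxtra derives the value: the namespace, the function name `file_subject_id`, and the choice to scope the hash by tenant are all assumptions for illustration.

```python
import uuid

# Hypothetical namespace: any fixed UUID works as long as it never changes.
FILE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "connectors.example.invalid")


def file_subject_id(data_controller_id: str, file_id: str) -> str:
    """Return a deterministic UUID string for a file, scoped to one tenant."""
    # uuid5 hashes the name with SHA-1, so equal inputs always give equal UUIDs.
    return str(uuid.uuid5(FILE_NAMESPACE, f"{data_controller_id}:{file_id}"))
```

Scoping by tenant means the same file ID yields different identifiers for different data controllers. A name-based hash keeps the raw identifier out of the record, but it is not a substitute for encryption if identifiers are guessable.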

Compliance audit trail generation for personal data processing activities without exposing sensitive information:

| Processing Activity Type | Field ID Mapping | Privacy Protection |
|---|---|---|
| Contact Information Processing | Field IDs 1, 2, 3 | No actual contact info stored |
| Financial Data Processing | Field IDs 4, 5, 6 | No financial data stored |
| Health Information Processing | Field IDs 7, 8, 9 | No health data stored |
| Government ID Processing | Field IDs 10, 11, 12 | No ID numbers stored |

Privacy-Preserving Implementation:

Python
from typing import List

def _map_pii_types_to_field_ids(self, entity_types: List[str]) -> List[int]:
    """Map PII entity types to field IDs (privacy-preserving)."""
    field_mapping = {
        "EMAIL": 1,      # Email processing activity
        "PHONE": 2,      # Phone processing activity
        "SSN": 3,        # SSN processing activity
        "CREDIT_CARD": 4 # Financial processing activity
    }
    # Returns field IDs for processing activity logging only.
    # Entity types missing from the mapping fall back to field ID 99.
    return [field_mapping.get(entity_type, 99) for entity_type in entity_types]

Google Drive Privacy Logging

Privacy-first Google Drive integration that creates compliance audit trails without storing file content or personal data.

Privacy-Preserving File Discovery

Privacy-by-Design Setup

  • AWS Secrets Manager for secure OAuth credential storage
  • SSM Parameter Store for change detection tokens
  • No file content or personal data storage
  • Aggregate activity logging only

Privacy-Compliant Discovery Process:

  • Processing Activity Logging


    Aggregate activity logging without storing file content or personal information

    What Gets Logged:

      • Number of files discovered (aggregate count)
      • MIME types found (categories, not actual files)
      • Processing activity timestamp
      • Data controller association

    What's NOT logged: File names, content, personal data

    Privacy Protection:

      • File discovery creates single processing activity record
      • No individual file records stored
      • No access to actual file content
      • Compliance audit trail only

  • Secure Credential Management


    Enterprise-grade security with OAuth 2.0 and AWS-native credential management

    Security Features:

      • AWS Secrets Manager for OAuth tokens
      • SSM Parameter Store for change detection
      • No credential exposure in logs or databases
      • Automatic token refresh and rotation
      • Comprehensive audit trails for access patterns

    Compliance Integration:

      • Automated GDPR Article 30 activity record generation
      • Data processor registration in compliance system
      • Processing activity audit trails
      • No personal data collection or storage

  • Privacy-Compliant Monitoring


    Audit trail monitoring with privacy risk assessment for compliance reporting

    Monitoring Capabilities:

      • Processing activity frequency tracking
      • Data controller activity oversight
      • Compliance audit trail generation
      • Privacy risk assessment based on activity patterns

    Analytics & Reporting:

      • Processing activity compliance dashboards
      • Automated regulatory compliance reports for activities
      • Data processing inventory management
      • Audit trail verification and validation

Privacy-First Implementation

Enterprise Google Drive setup focused on privacy compliance and audit trail generation.

Prerequisites:

Bash
# AWS infrastructure for secure credential management
# Dxtra enterprise subscription with compliance features
# Google Cloud Platform OAuth application

Step 1: Secure Credential Configuration

JSON
{
  "aws_secrets_manager": {
    "google_drive_credentials": {
      "client_id": "your-oauth-client-id",
      "client_secret": "your-oauth-client-secret", 
      "refresh_token": "oauth-refresh-token"
    }
  },
  "privacy_settings": {
    "content_access": false,
    "metadata_only": true,
    "audit_trail_generation": true,
    "compliance_logging": "gdpr_article_30"
  }
}
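
The credential layout above suggests a single secret holding three OAuth values. A loader for it might look like the sketch below. It assumes the secret is stored as a JSON string with exactly those key names, and `load_drive_credentials` is a hypothetical helper rather than part of the Dxtra codebase. The client is passed in so the function can be exercised without AWS access; `get_secret_value` and its `SecretString` response field are the standard Secrets Manager API.

```python
import json
from typing import Any, Dict

REQUIRED_KEYS = ("client_id", "client_secret", "refresh_token")


def load_drive_credentials(secrets_client: Any, secret_id: str) -> Dict[str, str]:
    """Fetch the OAuth credential bundle and check that every key is present."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    credentials = json.loads(response["SecretString"])
    missing = [key for key in REQUIRED_KEYS if not credentials.get(key)]
    if missing:
        # Report key names only; secret values never appear in errors or logs.
        raise ValueError(f"secret is missing keys: {', '.join(missing)}")
    # Return only the expected keys so stray fields are not passed around.
    return {key: credentials[key] for key in REQUIRED_KEYS}
```

With boto3 installed, the client would typically be created with `boto3.client("secretsmanager")`. The secret name is deployment-specific and is not given in this guide.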

Step 2: Processing Activity Configuration

YAML
google_drive_processing_activity:
  data_processor:
    name: "Google Drive Discovery"
    source_id: 2
    type_id: 3  # File discovery activity

  privacy_protection:
    no_content_access: true
    no_personal_data_storage: true
    aggregate_logging_only: true

  compliance_settings:
    article_30_compliance: true
    audit_trail_generation: true
    data_controller_association: true
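
A configuration like this is only protective if the flags are actually checked before a connector starts. The sketch below shows one possible pre-flight check over the parsed YAML (for example, a dict produced by PyYAML's `yaml.safe_load`). The function name `find_config_problems` and the rule that every listed flag must be `true` are assumptions for illustration.

```python
from typing import Any, Dict, List

# Flags from the YAML above that must all be true for a privacy-preserving setup.
REQUIRED_TRUE = {
    "privacy_protection": (
        "no_content_access",
        "no_personal_data_storage",
        "aggregate_logging_only",
    ),
    "compliance_settings": (
        "article_30_compliance",
        "audit_trail_generation",
        "data_controller_association",
    ),
}


def find_config_problems(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the config passes."""
    section = config.get("google_drive_processing_activity", {})
    problems = []
    for group, flags in REQUIRED_TRUE.items():
        for flag in flags:
            if section.get(group, {}).get(flag) is not True:
                problems.append(f"{group}.{flag} must be true")
    processor = section.get("data_processor", {})
    for key in ("source_id", "type_id"):
        value = processor.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"data_processor.{key} must be an integer")
    return problems
```

Returning a list rather than raising on the first failure lets an operator see every misconfigured flag in one pass.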

AWS Lambda-based privacy detection that creates audit trails without accessing or storing sensitive data.

Lambda Architecture:

Python
# Google Drive Discovery Lambda (privacy-preserving)
import json


async def lambda_handler_async(event, context):
    """
    Privacy-preserving Google Drive file discovery.

    Creates processing activity audit trails without storing:
    - File content
    - File names
    - Personal data
    - Individual file records
    """
    data_controller_id = event.get("tenantId")
    # Assumption: the event carries a full-scan flag; the "fullScan" key is illustrative
    full_scan_requested = bool(event.get("fullScan", False))

    # Discover files (metadata only for counting)
    files = discover_files_from_drive()  # Count only

    # Log privacy-preserving activity; stays False when no files are found
    activity_logged = False
    if files:
        activity_logged = await _log_file_discovery_activity(
            data_controller_id, 
            files,  # Used for aggregate counting only
            full_scan_requested
        )

    # Return aggregate data for Step Functions
    return {
        "statusCode": 200,
        "body": json.dumps({
            "tenantId": data_controller_id,
            "files": files,  # For Step Functions processing only
            "privacy_activity_logged": activity_logged
        })
    }

Privacy-Preserving Activity Logging:

Python
async def _log_file_discovery_activity(
    data_controller_id: str, 
    files: List[DriveFile], 
    full_scan: bool = False
) -> bool:
    """
    Log file discovery activity (privacy-preserving).

    Creates single data_processing_activities record with:
    - Aggregate file count (not individual files)
    - MIME type categories (not specific files)
    - Processing timestamp
    - Data controller association
    """
    discovery_metadata = FileDiscoveryMetadata(
        files_discovered_count=len(files),
        file_types_discovered=[...],  # Categories only
        full_scan=full_scan
    )

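    # Single processing activity record (privacy-preserving)
    result = await client.log_file_discovery_activity(
        data_controller_id, 
        discovery_metadata
    )
    # Report success to the caller; assumes the client returns None on failure
    return result is not None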
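
To make the "aggregate only" idea concrete, the sketch below reduces a list of discovered files to a count and a set of MIME type categories, dropping IDs and names along the way. `DriveFile` and `DiscoverySummary` here are hypothetical stand-ins defined for the example; the field names mirror the metadata shown above, but the real types may differ.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class DriveFile:
    """Hypothetical stand-in for the connector's file type."""
    id: str
    name: str
    mime_type: str


@dataclass(frozen=True)
class DiscoverySummary:
    """Aggregate-only view of a discovery run; holds no file IDs or names."""
    files_discovered_count: int
    file_types_discovered: List[str]
    count_by_type: Dict[str, int]


def summarize_discovery(files: Iterable[DriveFile]) -> DiscoverySummary:
    """Collapse per-file metadata into counts keyed by MIME type."""
    counts = Counter(f.mime_type for f in files)
    return DiscoverySummary(
        files_discovered_count=sum(counts.values()),
        file_types_discovered=sorted(counts),
        count_by_type=dict(counts),
    )
```

Because the summary type has no field that could hold a file name or ID, per-file details cannot leak into the logged record by accident.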

Privacy-Compliant Workflow

Lambda-Based Processing

  • Google Drive Discovery Lambda: Creates aggregate processing activity records
  • PII Scanner Lambda: Logs PII detection activities (no content storage)
  • Step Functions Orchestration: Coordinates privacy-compliant processing
  • AWS-Native Security: Secrets Manager + SSM Parameter Store

Processing Activity Creation

When files are discovered, the system automatically:

  1. Creates aggregate data processing activity record in Dxtra
  2. Associates activity with appropriate data controller
  3. Logs processing type and source (no file content)
  4. Generates compliance audit trail for Article 30 requirements

PII Scanner Privacy Logging

Privacy-First Detection Methods

The PII scanner creates processing activity audit trails without storing actual PII content:

  • Pattern detection for compliance activity logging
  • Aggregate metadata collection (counts and types only)
  • Privacy-preserving processing with no content storage
  • Compliance audit trails for GDPR Article 30 requirements

Processing Activity Categories

| Activity Category | Field ID | Privacy Protection |
|---|---|---|
| Contact Data Processing | Field IDs 1-3 | No contact info stored |
| Financial Data Processing | Field IDs 4-6 | No financial data stored |
| Health Data Processing | Field IDs 7-9 | No health data stored |
| Identity Data Processing | Field IDs 10-12 | No identity data stored |
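
The ranges in this table can be turned into a simple lookup, which is useful when a report needs category names rather than raw field IDs. The following is an illustrative sketch only: the ranges are copied from the table, while `categories_for` is a hypothetical helper name.

```python
from typing import Iterable, List

# Ranges copied from the table above: (first_id, last_id, category).
FIELD_ID_CATEGORIES = [
    (1, 3, "Contact Data Processing"),
    (4, 6, "Financial Data Processing"),
    (7, 9, "Health Data Processing"),
    (10, 12, "Identity Data Processing"),
]


def categories_for(field_ids: Iterable[int]) -> List[str]:
    """Return the distinct activity categories for the IDs, in table order."""
    ids = set(field_ids)
    return [
        name
        for first, last, name in FIELD_ID_CATEGORIES
        if any(first <= field_id <= last for field_id in ids)
    ]
```

IDs outside every range, such as a catch-all value, produce no category and are ignored by this lookup.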

Privacy-Preserving Implementation

Python
# PII Scanner Lambda (privacy-preserving)
import hashlib
from typing import List, Optional


async def log_pii_scan_activity(
    data_controller_id: str,
    file: DriveFile,
    entities: List[Entity],
    scan_duration_ms: Optional[int] = None,
) -> bool:
    """
    Log PII scanning activity in privacy-preserving way.

    Creates data_processing_activities record with only:
    - Entity types found (not locations or content)
    - Entity counts (not actual entities)
    - Confidence scores (aggregate)
    - Processing timestamp
    """
    # Privacy-preserving metadata (no actual PII); sorted for a stable order
    entity_types = sorted({entity.entity_type for entity in entities})

    pii_metadata = PiiDetectionMetadata(
        entity_types_found=entity_types,  # Types only
        entity_count=len(entities),       # Count only
        confidence_scores=[...],          # Aggregate scores
        scan_duration_ms=scan_duration_ms # Performance only
    )

    # The built-in hash() is salted per process and returns an int,
    # so use a SHA-256 digest for a stable string identifier.
    file_digest = hashlib.sha256(f"{file.id}:{file.name}".encode("utf-8")).hexdigest()

    # Create processing activity record (privacy-preserving)
    result = await client.log_pii_scan_activity(
        data_controller_id=data_controller_id,
        file_identifier=file_digest,  # Hash only
        pii_metadata=pii_metadata  # Aggregate metadata only
    )
    # Report success to the caller; assumes the client returns None on failure
    return result is not None
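
The function above records confidence scores in aggregate form without showing how they are aggregated. One plausible reduction, sketched here under stated assumptions, is a per-type summary of count, minimum, maximum, and mean. The `Entity` class below is a hypothetical stand-in that carries only a type and a score, and `aggregate_confidence` is an illustrative name rather than a Dxtra function.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for a detection result: type and score only."""
    entity_type: str
    score: float


def aggregate_confidence(entities: Iterable[Entity]) -> Dict[str, Dict[str, float]]:
    """Summarise scores per entity type as count, min, max, and mean."""
    by_type: Dict[str, List[float]] = {}
    for entity in entities:
        by_type.setdefault(entity.entity_type, []).append(entity.score)
    return {
        entity_type: {
            "count": len(scores),
            "min": min(scores),
            "max": max(scores),
            "mean": round(mean(scores), 3),
        }
        for entity_type, scores in by_type.items()
    }
```

A summary like this keeps the audit value of the scores (how confident the scan was overall) without retaining one score per detected item, which could otherwise hint at where in a file the data sits.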

Lambda Architecture

Event-Driven Processing:

flowchart LR
    A[Step Functions] --> B[PII Scanner Lambda]
    B --> C[Presidio Analysis]
    C --> D[Privacy Activity Client]
    D --> E[Hasura GraphQL]
    E --> F[data_processing_activities]

Privacy-Preserving Flow:

  1. File Processing: Lambda downloads and scans file temporarily
  2. PII Detection: Presidio analyzes content (in memory only)
  3. Activity Logging: Creates processing activity record (no content stored)
  4. Cleanup: All file content and PII data discarded
  5. Audit Trail: Only processing activity metadata retained
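
The flow above can be sketched as a single function that holds content in a local variable, hands it to a detector, and returns nothing but per-type counts. To keep the example runnable without Presidio installed, it uses a small regex detector as a stand-in; in the real pipeline the `detect` callable would wrap the Presidio analyzer. All names here (`scan_in_memory`, `regex_detector`) are hypothetical.

```python
import re
from collections import Counter
from typing import Callable, Dict, List

# Stand-in detector so the sketch runs without Presidio; not production-grade.
_EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def regex_detector(text: str) -> List[str]:
    """Return one entity type label per match; never the matched text."""
    return ["EMAIL" for _ in _EMAIL.finditer(text)]


def scan_in_memory(
    download: Callable[[], bytes],
    detect: Callable[[str], List[str]] = regex_detector,
) -> Dict[str, int]:
    """Download, scan, and return only per-type counts."""
    content = download()  # Step 1: content lives only in this local variable
    try:
        labels = detect(content.decode("utf-8", errors="replace"))  # Step 2
        return dict(Counter(labels))  # Steps 3 and 5: counts and types only
    finally:
        del content  # Step 4: drop the reference; nothing is written to disk
```

Note that `del` only releases the reference; it does not scrub memory. The privacy property shown here comes from the return type, which has no room for matched text or file content.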

Custom Integration Framework

Privacy-Compliant Connector Architecture

Base Privacy Activity Connector

Python
from privacy_activity_client import PrivacyActivityClient

class CustomPrivacyConnector:
    def __init__(self, config):
        self.client = PrivacyActivityClient(config.hasura_config)
        self.data_controller_id = config.data_controller_id

    async def log_processing_activity(self, activity_type, metadata):
        """Log processing activity without storing personal data"""
        return await self.client.log_processing_activity(
            data_controller_id=self.data_controller_id,
            activity_type=activity_type,
            privacy_metadata=metadata  # Aggregate metadata only
        )
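
Since the connector trusts callers to pass aggregate metadata only, a custom integration may want a guard that rejects anything resembling free text before it is logged. The check below is a heuristic sketch, not part of the `PrivacyActivityClient` API: it accepts numbers, booleans, and short label-like strings (such as entity types or MIME types) and rejects everything else. The function name, the pattern, and the 64-character cap are assumptions.

```python
import re
from typing import Any, Mapping

# Label-like strings only: letters, digits, and a few separators, up to 64 chars.
_LABEL = re.compile(r"[A-Za-z0-9_./+-]{1,64}")


def assert_aggregate_only(metadata: Mapping[str, Any]) -> None:
    """Raise ValueError if metadata holds anything but numbers or short labels."""
    for key, value in metadata.items():
        items = value if isinstance(value, (list, tuple)) else [value]
        for item in items:
            if isinstance(item, (bool, int, float)):
                continue
            if isinstance(item, str) and _LABEL.fullmatch(item):
                continue
            # Name the offending key but never echo the value itself.
            raise ValueError(f"metadata field {key!r} is not aggregate-only")
```

Calling this at the top of `log_processing_activity` would stop values containing spaces or `@` (typical of names and email addresses) from reaching the audit log. It cannot catch every case, so it complements rather than replaces careful connector design.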

Privacy-First Configuration

YAML
connector_config:
  name: "Custom Privacy Connector"
  version: "1.0.0"
  privacy_settings:
    content_access: false
    metadata_only: true
    processing_activity_logging: true
    compliance_mode: "gdpr_article_30"
  data_processor:
    source_id: 10  # Custom connector source ID
    type_ids: [20, 21, 22]  # Custom activity types

Monitoring and Compliance

Processing Activity Dashboard

Real-Time Compliance Monitoring

JSON
{
  "processing_activities": {
    "google_drive_discovery": {
      "status": "active",
      "last_activity": "2025-08-07T10:30:00Z", 
      "activities_logged": 15420,
      "data_controllers": 50,
      "compliance_status": "gdpr_compliant"
    },
    "pii_detection": {
      "status": "active",
      "activities_logged": 2847,
      "privacy_compliance": "full",
      "content_storage": "none"
    }
  }
}
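
A status payload like the one above lends itself to a simple automated check. The sketch below flags any connector that is not active or that reports stored content. It assumes the JSON shape shown in the example; `connectors_needing_attention` is a hypothetical helper, and treating a missing `content_storage` field as acceptable is a design choice made for this illustration.

```python
import json
from typing import List


def connectors_needing_attention(status_json: str) -> List[str]:
    """Return names of connectors that are inactive or report stored content."""
    activities = json.loads(status_json)["processing_activities"]
    flagged = []
    for name, info in activities.items():
        inactive = info.get("status") != "active"
        # Not every connector reports content_storage; absence is not flagged.
        stores_content = info.get("content_storage", "none") != "none"
        if inactive or stores_content:
            flagged.append(name)
    return sorted(flagged)
```

An empty result means every connector in the payload is active and none reports content storage.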

Compliance Metrics

  • Processing Activities Logged: Total audit trail entries
  • Privacy Compliance: Full privacy-by-design implementation
  • Content Storage: Zero content or personal data storage
  • Audit Trail Coverage: Complete GDPR Article 30 compliance

API Reference

Processing Activity API

Lambda Functions (Internal)

YAML
# These are internal AWS Lambda functions, not public APIs
lambda_functions:
  google_drive_discovery:
    handler: "fetch_lambda.lambda_handler"
    privacy_mode: "audit_trail_only"

  pii_scanner:
    handler: "scanner_lambda.lambda_handler_sync"
    privacy_mode: "metadata_only"

GraphQL Integration

GraphQL
# Processing activities are logged via existing Hasura schema
mutation LogProcessingActivity($object: dataProcessingActivities_insert_input!) {
  insertDataProcessingActivity(object: $object) {
    id
    createdAt
    dataSubjectId
    sourceId
    typeId
    fieldIds
  }
}
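
For readers calling the mutation from a custom connector, the sketch below builds the HTTP request using only the standard library. The input field names (`dataSubjectId`, `sourceId`, `typeId`, `fieldIds`) are inferred from the selection set above and may not match the actual insert input type. The endpoint URL is deployment-specific, and the `x-hasura-admin-secret` header is one of several authentication options Hasura supports, so a production setup may use a JWT instead.

```python
import json
import urllib.request
from typing import Any, Dict, List

MUTATION = """
mutation LogProcessingActivity($object: dataProcessingActivities_insert_input!) {
  insertDataProcessingActivity(object: $object) { id createdAt }
}
"""


def build_request(
    endpoint: str,
    admin_secret: str,
    data_subject_id: str,
    source_id: int,
    type_id: int,
    field_ids: List[int],
) -> urllib.request.Request:
    """Build, but do not send, the POST request for the mutation."""
    payload: Dict[str, Any] = {
        "query": MUTATION,
        "variables": {
            "object": {
                # Field names are assumptions inferred from the selection set.
                "dataSubjectId": data_subject_id,
                "sourceId": source_id,
                "typeId": type_id,
                "fieldIds": field_ids,
            }
        },
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-hasura-admin-secret": admin_secret,
        },
        method="POST",
    )
```

Sending the request is then a matter of passing it to `urllib.request.urlopen` and decoding the JSON response. Keeping construction separate from sending makes the payload easy to inspect in tests.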

Best Practices

Privacy Implementation Guidelines

  1. Privacy by Design:
       • Never store actual file content or personal data
       • Log only aggregate processing activities
  2. Compliance First:
       • Focus on GDPR Article 30 audit trail requirements
       • Generate processing activity records for transparency
  3. Security:
       • Use AWS-native credential management
       • Implement proper access controls and audit logging

Data Protection Principles

  1. Data Minimization:
       • Collect only processing activity metadata necessary for compliance
  2. Purpose Limitation:
       • Use data only for compliance audit trail generation
  3. Storage Limitation:
       • No personal data or content storage
  4. Privacy by Default:
       • All connectors implement privacy-preserving design

Support and Resources

Getting Help

  • Dxtra Support: connectors-support@dxtra.io
  • Privacy Documentation: Focus on compliance and audit trail generation
  • Developer Integration: Privacy-first connector development guides

Last updated: 2025-08-12
Privacy-preserving implementation with GDPR Article 30 compliance
All connectors implement privacy-by-design with zero content storage