Google Drive

Connect Google Drive to Dxtra to automatically scan documents for personally identifiable information (PII). Dxtra identifies email addresses, phone numbers, government IDs, financial data, and other personal identifiers across your files.

What Gets Scanned

The PII scanner detects personal data in 50+ file formats:

Category          File Types
Documents         PDF, DOCX, DOC, TXT, RTF
Spreadsheets      XLSX, XLS, CSV
Presentations     PPTX, PPT
Google Workspace  Google Docs, Sheets, Slides (exported for scanning)
Images            PNG, JPG, TIFF (via OCR)

What Dxtra Stores

Dxtra stores only metadata about detected PII (entity types, counts, confidence scores). The actual file content and personal data are never stored in Dxtra.
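
To make the metadata-only idea concrete, here is a minimal sketch of reducing raw scanner hits to entity types, counts, and confidence scores. The `Detection` class, the `summarize` function, and the entity type labels are hypothetical names chosen for illustration; they are not Dxtra's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One raw scanner hit (hypothetical shape, for illustration only)."""
    entity_type: str   # e.g. "EMAIL_ADDRESS"
    matched_text: str  # the personal data itself; dropped by summarize()
    confidence: float  # 0.0 to 1.0


def summarize(detections):
    """Reduce raw hits to per-entity-type counts and the highest confidence."""
    summary = defaultdict(lambda: {"count": 0, "max_confidence": 0.0})
    for d in detections:
        entry = summary[d.entity_type]
        entry["count"] += 1
        entry["max_confidence"] = max(entry["max_confidence"], d.confidence)
    return dict(summary)


hits = [
    Detection("EMAIL_ADDRESS", "jane@example.com", 0.95),
    Detection("EMAIL_ADDRESS", "joe@example.com", 0.80),
    Detection("PHONE_NUMBER", "555-0100", 0.60),
]
metadata = summarize(hits)
```

In this sketch, `metadata` carries only the entity type, count, and confidence; the matched text never leaves the `Detection` objects.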

PII Types Detected

The scanner identifies these personal data categories:

Category       Identifiers
Contact        Email addresses, phone numbers, person names
Government ID  SSN, passport numbers, driver's license numbers
Financial      Credit card numbers, bank account numbers, IBAN codes
Healthcare     Medical license numbers, NHS numbers
Location       Physical addresses, coordinates
Digital        Cryptocurrency addresses

Each detection includes a confidence score indicating how likely the match is to be accurate.

Prerequisites

Before connecting Google Drive:

  1. Google Workspace admin access -- You need permissions to authorize OAuth applications
  2. Dxtra account with admin access -- Required to configure integrations
  3. Data controller setup complete -- Your Dxtra account must have a configured data controller

Setup

Step 1: Connect via OAuth

  1. In the Dxtra dashboard, go to Processors
  2. Select Google Drive from the available integrations
  3. Click Connect to start the OAuth authorization flow
  4. Sign in to your Google account and authorize Dxtra to access your Drive files
  5. You are redirected back to Dxtra with the connection confirmed

Step 2: Run Initial Scan

After connecting, Dxtra performs a full scan of your Google Drive:

  1. Files are discovered and classified by type
  2. Scannable files are queued for PII detection
  3. Each file is scanned individually (up to 40 files processed concurrently)
  4. Results appear in Data Mapping & Profiling in the dashboard
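
The concurrency limit in step 3 can be pictured as a bounded worker pattern. The following is a minimal sketch using a semaphore, not Dxtra's implementation: `scan_file` is a hypothetical placeholder for the real per-file scan, and only the limit of 40 comes from the steps above.

```python
import asyncio

MAX_CONCURRENT_SCANS = 40  # concurrency limit stated in the steps above


async def scan_file(file_id):
    """Hypothetical placeholder for the real PII scan of one file."""
    await asyncio.sleep(0)  # stands in for download and detection work
    return {"file_id": file_id, "status": "scanned"}


async def scan_all(file_ids, limit=MAX_CONCURRENT_SCANS):
    """Scan every file, never running more than `limit` scans at once."""
    semaphore = asyncio.Semaphore(limit)

    async def scan_with_limit(file_id):
        async with semaphore:  # waits here while `limit` scans are in flight
            return await scan_file(file_id)

    # gather() returns results in the same order as the input file IDs.
    return await asyncio.gather(*(scan_with_limit(f) for f in file_ids))


results = asyncio.run(scan_all([f"file-{i}" for i in range(100)]))
```

With a pattern like this, a drive of any size is processed in overlapping batches rather than one file at a time or all at once.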

Step 3: Review Results

  1. Go to Data Mapping & Profiling in the dashboard sidebar
  2. View the summary: total files scanned, PII detected, sensitive PII, confidence scores
  3. Filter results by data type, file name, processor, file type, or confidence threshold
  4. Review individual file results for detailed PII findings
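
The dashboard filters in step 3 behave like a conjunction of conditions over the result rows. The sketch below shows that logic on sample data; the field names (`file_name`, `file_type`, `data_type`, `confidence`) and the `filter_results` helper are assumptions for illustration, since the actual result schema is not described here.

```python
def filter_results(results, min_confidence=0.0, file_type=None, data_type=None):
    """Return the results that match every filter that was supplied."""
    matched = []
    for r in results:
        if r["confidence"] < min_confidence:
            continue
        if file_type is not None and r["file_type"] != file_type:
            continue
        if data_type is not None and r["data_type"] != data_type:
            continue
        matched.append(r)
    return matched


# Sample rows with a hypothetical schema.
results = [
    {"file_name": "contacts.csv", "file_type": "CSV",
     "data_type": "EMAIL_ADDRESS", "confidence": 0.97},
    {"file_name": "scan.png", "file_type": "PNG",
     "data_type": "PHONE_NUMBER", "confidence": 0.55},
    {"file_name": "invoice.pdf", "file_type": "PDF",
     "data_type": "IBAN_CODE", "confidence": 0.88},
]
high_confidence = filter_results(results, min_confidence=0.8)
```

Raising the confidence threshold is a quick way to hide likely false positives, such as low-confidence OCR matches from images.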

Incremental Scanning

After the initial full scan, Dxtra uses Google Drive's Changes API to detect new and modified files. Only those files are rescanned, so an incremental scan typically finishes much faster than the initial full scan.
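
For readers unfamiliar with the Changes API, the sketch below shows the general paging pattern: start from a saved page token, walk every page of changes, and keep the new start token for the next run. It illustrates the API's documented response shape (`changes`, `nextPageToken`, `newStartPageToken`) and is not Dxtra's code. The `collect_changed_file_ids` helper is hypothetical, and the page fetcher is passed in so the example runs without credentials. With google-api-python-client, the fetcher could be something like `lambda t: service.changes().list(pageToken=t).execute()`.

```python
def collect_changed_file_ids(list_changes, page_token):
    """Walk every page of changes; return (changed_ids, next_start_token).

    `list_changes(page_token)` must return a dict shaped like a Drive v3
    `changes.list` response: a `changes` list plus either `nextPageToken`
    (more pages follow) or `newStartPageToken` (this was the last page).
    """
    changed = []
    new_start_token = page_token
    while page_token is not None:
        page = list_changes(page_token)
        for change in page.get("changes", []):
            if not change.get("removed"):  # deleted files need no rescan
                changed.append(change["fileId"])
        if "newStartPageToken" in page:
            new_start_token = page["newStartPageToken"]
        page_token = page.get("nextPageToken")
    return changed, new_start_token


# Fake two-page response standing in for the real API.
fake_pages = {
    "t1": {"changes": [{"fileId": "a", "removed": False}],
           "nextPageToken": "t2"},
    "t2": {"changes": [{"fileId": "b", "removed": True},
                       {"fileId": "c", "removed": False}],
           "newStartPageToken": "t3"},
}
changed_ids, next_token = collect_changed_file_ids(fake_pages.__getitem__, "t1")
```

Storing `next_token` after each run is what lets the following scan pick up only the changes made since.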

File Filtering

Not all files are scanned. The scanner automatically skips:

  • Video files (MP4, AVI, etc.)
  • Audio files (MP3, WAV, etc.)
  • Archive files (ZIP, RAR, 7z)
  • Binary executables
  • Google Forms, Scripts, Sites, and Maps
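
A skip list like the one above is commonly implemented as a check on each file's MIME type. The following is a minimal sketch under that assumption; the exact rules Dxtra applies are not documented here, and the `is_scannable` helper and the specific archive and executable MIME types are illustrative choices. The `application/vnd.google-apps.*` values are the MIME types Google Drive assigns to Forms, Apps Script projects, Sites, and My Maps.

```python
# Whole media families skipped by prefix.
SKIPPED_PREFIXES = ("video/", "audio/")

# Individual types skipped by exact match (illustrative, not exhaustive).
SKIPPED_TYPES = {
    "application/zip",
    "application/vnd.rar",
    "application/x-7z-compressed",
    "application/x-msdownload",  # a common Windows executable type
    "application/vnd.google-apps.form",
    "application/vnd.google-apps.script",
    "application/vnd.google-apps.site",
    "application/vnd.google-apps.map",
}


def is_scannable(mime_type):
    """Return False for file types that the scanner would skip."""
    if mime_type.startswith(SKIPPED_PREFIXES):
        return False
    return mime_type not in SKIPPED_TYPES
```

For example, `is_scannable("application/pdf")` is true, while `is_scannable("video/mp4")` is false.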

Disconnecting

To disconnect Google Drive, go to Processors in the dashboard and remove the Google Drive integration. This stops all scanning and removes the OAuth connection.

Troubleshooting

Issue                   Solution
OAuth connection fails  Ensure you have admin permissions on the Google Workspace account
No scan results         Check that files exist in the connected Drive and are in supported formats
Low confidence scores   Confidence varies by file type -- OCR of images typically produces lower confidence than text documents
Scan not completing     Large drives may take time. Check Data Mapping & Profiling for partial results