Google Drive
Connect Google Drive to Dxtra to automatically scan your documents for personally identifiable information (PII). Dxtra identifies emails, phone numbers, government IDs, financial data, and other personal identifiers across your files.
What Gets Scanned
The PII scanner detects personal data in 50+ file formats, including:
| Category | File Types |
|---|---|
| Documents | PDF, DOCX, DOC, TXT, RTF |
| Spreadsheets | XLSX, XLS, CSV |
| Presentations | PPTX, PPT |
| Google Workspace | Google Docs, Sheets, Slides (exported for scanning) |
| Images | PNG, JPG, TIFF (via OCR) |
What Dxtra stores
Dxtra stores only metadata about detected PII (entity types, counts, confidence scores). The actual file content and personal data are never stored in Dxtra.
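As a rough illustration of what "metadata only" can mean in practice, the sketch below reduces raw detections to entity types, counts, and confidence scores, and drops the matched text. The `FileScanSummary` class, the `summarize` function, and the entity type names are hypothetical and do not describe Dxtra's actual storage schema.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FileScanSummary:
    """Hypothetical per-file record: metadata only, no matched text."""
    file_id: str
    entity_counts: dict[str, int] = field(default_factory=dict)
    max_confidence: dict[str, float] = field(default_factory=dict)


def summarize(file_id: str, detections: list[tuple[str, str, float]]) -> FileScanSummary:
    """Reduce (entity_type, matched_text, confidence) tuples to metadata.

    The matched text is read only to be skipped; it is never copied
    into the returned summary.
    """
    counts = Counter(entity_type for entity_type, _text, _conf in detections)
    best: dict[str, float] = {}
    for entity_type, _text, conf in detections:
        best[entity_type] = max(conf, best.get(entity_type, 0.0))
    return FileScanSummary(file_id, dict(counts), best)


# Illustrative input; the values are made up for this example.
summary = summarize("file-123", [
    ("EMAIL_ADDRESS", "jane@example.com", 0.95),
    ("EMAIL_ADDRESS", "joe@example.com", 0.80),
    ("PHONE_NUMBER", "+1 555 0100", 0.60),
])
print(summary)
```

The summary keeps enough information to populate a dashboard (what kind of PII, how much, how certain) without retaining any of the personal data itself.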
PII Types Detected
The scanner identifies these personal data categories:
| Category | Identifiers |
|---|---|
| Contact | Email addresses, phone numbers, person names |
| Government ID | SSN, passport numbers, driver's license numbers |
| Financial | Credit card numbers, bank account numbers, IBAN codes |
| Healthcare | Medical license numbers, NHS numbers |
| Location | Physical addresses, coordinates |
| Digital | Cryptocurrency addresses |
Each detection includes a confidence score indicating how likely the match is to be accurate.
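To give a feel for how a confidence score might arise, here is a minimal sketch for one identifier type, credit card numbers. It is not Dxtra's detector: the regular expression, the `find_card_numbers` function, and the scores 0.9 and 0.3 are assumptions chosen for illustration. The idea is that a bare pattern match earns a low score, and a passing Luhn checksum raises it.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[tuple[str, float]]:
    """Return (entity_type, confidence) pairs for card-like digit runs.

    The confidence values are arbitrary illustrative numbers.
    """
    results = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        confidence = 0.9 if luhn_valid(digits) else 0.3
        results.append(("CREDIT_CARD", confidence))
    return results


sample = "Card 4111 1111 1111 1111 on file; ref 1234 5678 9012 3456."
print(find_card_numbers(sample))
```

In this sample the first number is a well-known test card number that passes the checksum, while the second only looks like a card number, so it receives the lower score.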
Prerequisites
Before connecting Google Drive:
- Google Workspace admin access -- You need permissions to authorize OAuth applications
- Dxtra account with admin access -- Required to configure integrations
- Data controller setup complete -- Your Dxtra account must have a configured data controller
Setup
Step 1: Connect via OAuth
1. In the Dxtra dashboard, go to Processors
2. Select Google Drive from the available integrations
3. Click Connect to start the OAuth authorization flow
4. Sign in to your Google account and authorize Dxtra to access your Drive files
5. You are redirected back to Dxtra with the connection confirmed
Step 2: Run Initial Scan
After connecting, Dxtra performs a full scan of your Google Drive:
- Files are discovered and classified by type
- Scannable files are queued for PII detection
- Each file is scanned individually (up to 40 files processed concurrently)
- Results appear in Data Mapping & Profiling in the dashboard
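The queue-and-scan flow above, with its cap of 40 concurrent files, can be pictured with a bounded worker pool. This is only a sketch of the general pattern: `scan_file` is a hypothetical placeholder, and how Dxtra actually schedules scans is not documented here.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_SCANS = 40  # the concurrency limit stated in this guide


def scan_file(file_id: str) -> dict:
    """Hypothetical placeholder for scanning a single file."""
    return {"file_id": file_id, "status": "scanned"}


def scan_all(file_ids: list[str], max_workers: int = MAX_CONCURRENT_SCANS) -> list[dict]:
    """Scan every queued file, with at most max_workers scans in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as file_ids.
        return list(pool.map(scan_file, file_ids))


results = scan_all([f"file-{i}" for i in range(100)])
print(len(results), results[0])
```

With a bounded pool, a drive of any size is processed in batches of at most 40 in-flight files, which is why large drives take longer but do not overload the scanner.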
Step 3: Review Results
- Go to Data Mapping & Profiling in the dashboard sidebar
- View the summary: total files scanned, PII detected, sensitive PII, confidence scores
- Filter results by data type, file name, processor, file type, or confidence threshold
- Review individual file results for detailed PII findings
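If you work with scan results outside the dashboard, the same kind of filtering can be expressed in a few lines. The sketch below is hypothetical: the `Finding` fields and the `filter_findings` function are assumptions for illustration and do not reflect a documented Dxtra export format or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    """Hypothetical shape of one PII finding (metadata only)."""
    file_name: str
    file_type: str
    data_type: str
    confidence: float


def filter_findings(
    findings: list[Finding],
    *,
    data_type: Optional[str] = None,
    file_type: Optional[str] = None,
    min_confidence: float = 0.0,
) -> list[Finding]:
    """Keep findings that match every filter that was supplied."""
    return [
        f for f in findings
        if (data_type is None or f.data_type == data_type)
        and (file_type is None or f.file_type == file_type)
        and f.confidence >= min_confidence
    ]


# Made-up findings for the example.
findings = [
    Finding("contacts.xlsx", "XLSX", "EMAIL_ADDRESS", 0.95),
    Finding("scan.png", "PNG", "EMAIL_ADDRESS", 0.55),
    Finding("invoice.pdf", "PDF", "IBAN_CODE", 0.90),
]
high_confidence_emails = filter_findings(
    findings, data_type="EMAIL_ADDRESS", min_confidence=0.8
)
print(high_confidence_emails)
```

Raising the confidence threshold is a quick way to focus a first review on the most reliable detections, at the cost of hiding weaker matches such as those from OCR.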
Incremental Scanning
After the initial full scan, Dxtra uses Google Drive's Changes API to detect new and modified files. Only those files are rescanned, so an incremental scan typically finishes much faster than a full scan.
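For readers curious about how change tracking works in general, the sketch below follows the paging pattern of the Drive v3 `changes.list` method: start from a saved page token, follow `nextPageToken` until it runs out, and keep the `newStartPageToken` for the next run. The function assumes a `service` object that behaves like a client built with google-api-python-client; `FakeDriveService` is a stand-in so the example runs offline. None of this is Dxtra's actual implementation.

```python
def list_changed_file_ids(service, page_token):
    """Collect IDs of files changed since page_token.

    Returns (changed_file_ids, new_start_page_token). Removed files are
    skipped because there is nothing left to rescan.
    """
    changed = []
    new_start_token = None
    while page_token is not None:
        response = service.changes().list(
            pageToken=page_token,
            spaces="drive",
            fields="nextPageToken,newStartPageToken,changes(fileId,removed)",
        ).execute()
        for change in response.get("changes", []):
            if not change.get("removed"):
                changed.append(change["fileId"])
        # Only the last page carries newStartPageToken.
        new_start_token = response.get("newStartPageToken", new_start_token)
        page_token = response.get("nextPageToken")
    return changed, new_start_token


class _FakeRequest:
    def __init__(self, response):
        self._response = response

    def execute(self):
        return self._response


class FakeDriveService:
    """Offline stand-in that serves canned pages keyed by page token."""

    def __init__(self, pages):
        self._pages = pages

    def changes(self):
        return self

    def list(self, pageToken, **_kwargs):
        return _FakeRequest(self._pages[pageToken])


pages = {
    "t1": {
        "changes": [
            {"fileId": "a", "removed": False},
            {"fileId": "b", "removed": True},
        ],
        "nextPageToken": "t2",
    },
    "t2": {
        "changes": [{"fileId": "c", "removed": False}],
        "newStartPageToken": "t3",
    },
}
changed, next_token = list_changed_file_ids(FakeDriveService(pages), "t1")
print(changed, next_token)
```

Persisting the returned token between runs is what lets each incremental scan pick up exactly where the previous one stopped.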
File Filtering
Not all files are scanned. The scanner automatically skips:
- Video files (MP4, AVI, etc.)
- Audio files (MP3, WAV, etc.)
- Archive files (ZIP, RAR, 7z)
- Binary executables
- Google Forms, Scripts, Sites, and Maps
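A skip list like the one above is commonly implemented as a MIME type check. The sketch below is an assumption about how such a filter could look, not Dxtra's actual rule set: the `is_scannable` function and the exact MIME types chosen for archives and executables are illustrative. The `application/vnd.google-apps.*` identifiers are the MIME types Google Drive assigns to its native file kinds.

```python
SKIPPED_PREFIXES = ("video/", "audio/")

# Illustrative skip list; a real scanner may recognize more types.
SKIPPED_TYPES = {
    # Archives
    "application/zip",
    "application/vnd.rar",
    "application/x-7z-compressed",
    # Binary executables
    "application/x-msdownload",
    "application/x-executable",
    # Google-native types with no scannable document body
    "application/vnd.google-apps.form",
    "application/vnd.google-apps.script",
    "application/vnd.google-apps.site",
    "application/vnd.google-apps.map",
}


def is_scannable(mime_type: str) -> bool:
    """Return False for file types on the skip list."""
    mime_type = mime_type.lower()
    if mime_type.startswith(SKIPPED_PREFIXES):
        return False
    return mime_type not in SKIPPED_TYPES


for mime in ("application/pdf", "video/mp4", "application/vnd.google-apps.form"):
    print(mime, is_scannable(mime))
```

Filtering on MIME type rather than file extension means a renamed file is still classified by what it actually is, as reported by Drive.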
Disconnecting
To disconnect Google Drive, go to Processors in the dashboard and remove the Google Drive integration. This stops all scanning and removes the OAuth connection.
Troubleshooting
| Issue | Solution |
|---|---|
| OAuth connection fails | Ensure you have admin permissions on the Google Workspace account |
| No scan results | Check that files exist in the connected Drive and are in supported formats |
| Low confidence scores | Confidence varies by file type -- OCR of images typically produces lower confidence than text documents |
| Scan not completing | Scanning a large drive can take a while. Check Data Mapping & Profiling for partial results while the scan is still running |