Implementing Contextual Access Management for sensitive data in Snowflake

Protect your most sensitive Snowflake data with human-readable policies and business context from Systems of Record

Marc Jordan, VP of Product, SGNL

July 8, 2023

Access Control for modern data platforms

For the modern enterprise, data and the insights drawn from that data are its crown jewels. If you’ve worked with data or at a large enterprise in the last few years, you’ve probably heard about this data revolution and, with it, the phenomenal wave of platforms that have democratized data.

Snowflake, Databricks, Looker, and Tableau are now household names in the Enterprise, enabling Data Scientists, Engineers, Customer Support/Success, Product, and others to deeply engage with and understand data about their business.

While we’ll likely continue to yearn for more data and insights, the centralization and democratization of data have presented a new set of challenges. How do you control all of that data? More users than ever before have more access to data than they have ever had, and organizations are racing to catch up and comply with new and existing regulations.

Even Data Platforms themselves are racing to catch up, attempting to balance storing more data than ever with advanced analytic capabilities, security, and governance.

In this series of blog posts, we’ll discuss some of the limitations that exist in the modern data stack when it comes to security, governance, and privacy, and how SGNL can provide continuous evaluation of access to this data via human-readable policies that leverage business context in real time. We’ll also introduce you to one of SGNL’s partners, Cyral, which provides a lightning-fast, reliable sidecar that deeply integrates with Data Platforms like Snowflake and supports OPA policies to make local and/or centralized policy decisions with the SGNL Platform.

Let’s dive in and chat a little about Snowflake…

Limitations of Snowflake Access Control

As with almost any modern Data Platform or Service, Snowflake comes with a broad set of out-of-the-box capabilities for specifying local policy within the Snowflake Cloud. You can create individual users, grant roles and privileges, and even enable SSO and MFA for the platform.

These controls are adequate for providing coarse-grained, slow-moving access to non-sensitive data. But what about situations where someone needs just a little access for a short amount of time? Take, for example, a Customer Support user or Engineer who needs to support a customer for a short period while they resolve an issue. In these cases, slow-moving, coarse-grained permissions become overly cumbersome. We’ve seen first-hand that there is a range of challenges the modern enterprise must work through to provide the levels of fine-grained access that are needed:

Providing access to specific data (database, tables, rows, etc) based on context about the principal, their location, the device they’re on, or their current session is not possible with native controls
Visibility is limited to only a single instance, making auditing and controls across multiple instances, or multiple different systems challenging at best
Changing policy requires manual effort to grant the appropriate roles and privileges to someone who needs them, and consequently this access is frequently not removed when it is no longer needed
Policy cannot be easily reused across multiple instances, or across different systems that have the same requirements

It’s no wonder then that organizations that are broadly adopting the Snowflake Data Cloud are looking for centralized, continuous, and contextually sensitive access management that gives the levels of visibility and control they need to remain in compliance with their regulatory needs and customer privacy commitments.

How Does Snowflake Work?

Snowflake sits at the center of the modern data cloud, acting as a data lake/warehouse to satisfy the data storage needs of the enterprise and powering applications, insights, and learning platforms to propel a business (and its partners) forward.

With the Snowflake Data Cloud, organizations can spend less time trying to find the right data, and more time analyzing, extracting, and sharing insights.

How can SGNL and Cyral Help?

There are many benefits to using SGNL and Cyral to secure Snowflake:

Context-driven, continuous data authorization: Changes in business, user, or data context are available in near real time to be evaluated by the SGNL Policy Engine, so that changes in context, role, or process can immediately allow access to new data or remove access to existing data
Simplified Policy Management: There’s no need to move across Snowflake instances and change roles or permissions each time your business needs change. Simply modify (or simulate) policies centrally and apply them to Snowflake without ever needing to log in to a Snowflake instance
Real-time audit and comprehensive reporting: Because all access is evaluated centrally, auditing and reporting are available from a single pane of glass, in real time, across applications and services, giving a unique perspective on the breadth of a user’s access, or on the access that is happening in just a few apps or services
Centralized human-readable policies: Human-readable policies built from reusable snippets enable organizations to scale policies without losing manageability
Contextual authorization decisions: Based on physical location, time of day, device status and more, enabling access to change based on the context associated with a user’s individual request for access to data

How do SGNL and Cyral work together?

SGNL continuously ingests data from systems of record into a centralized graph directory via resilient and performant connectors. This includes identity data such as users and groups, as well as any other relevant data required to define access policies, such as ITSM cases or customers from a CRM. Administrators can then quickly author and manage human-readable policies for granting or denying access to sensitive data, applications, or APIs. By implementing a centralized approach, SGNL provides consistency in policy management, audit logging, and reporting across an organization’s most sensitive assets.

Cyral is deeply integrated with Data Platforms and is deployed as a high-performance, low-latency sidecar in the data path. Cyral policy is expressed via the Open Policy Agent (OPA) policy language, supporting extensibility into centralized authorization platforms such as SGNL (you can read more about SGNL+OPA here). With SGNL and Cyral, you’re able to deploy a lightweight sidecar component and unlock a breadth of data privacy capabilities, such as data masking and obfuscation, while reusing the same SGNL policies you use for your other applications and services.

With SGNL and Cyral, you can simply select policy snippets from the user interface and graphically build a human-readable policy. When a principal is attempting to access data inside of Snowflake, Cyral intercepts the request and has SGNL determine whether a given user is allowed to access the requested records, down to row and column level control.

Snowflake and SGNL Integration

The integration between SGNL, Cyral, and Snowflake is straightforward:

Scenario

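This article uses an example of two Account Executives attempting to get access to Customer data that is stored within Snowflake. In this example, we’ll use desk location data from an IdP (e.g. Azure AD), regional assignment data from Salesforce, and user context from a given user’s IP Address. Our two users are:

Nancy, an Account Executive based in the US/Canada region, who works with US and Canadian customers
John, an Account Executive based in Melbourne, Australia, who works with Australian and New Zealand customers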

As it happens, both users are currently in Australia at a company-wide meeting.

In this example, we’ve configured:

An existing Snowflake Instance running in the AWS Cloud
A Cyral sidecar deployed in the same AWS region as Snowflake
A SGNL client that is pre-integrated with Azure AD and Salesforce as Systems of Record

Configuring the Protected System

To get started, we’re going to configure a new protected system from the SGNL Dashboard called Snowflake. We’ve given it an appropriate description and configured the request configuration to accept an Azure AD email address as the principalId, and a Customer Id as the assetId.
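To make that request configuration a little more concrete, here is a minimal Python sketch of the request body a caller would send for this protected system. The field names (principal.id, principal.ipAddress, queries[].assetId, queries[].action) mirror the sample Rego policy later in this article. The function name, the example email address, the customer Id, and the choice to treat the IP address as optional are hypothetical and only for illustration; this is not an official SGNL client.

```python
import json


def build_access_request(principal_id, asset_id, ip_address=None, action="access"):
    """Build a JSON-serializable body for a single access evaluation.

    principal_id: the Azure AD email address of the calling user.
    asset_id: the Customer Id being requested.
    ip_address: optional client IP, used by location-based policies
                (treating it as optional is an assumption of this sketch).
    """
    principal = {"id": principal_id}
    if ip_address is not None:
        principal["ipAddress"] = ip_address
    return {
        "principal": principal,
        "queries": [{"assetId": asset_id, "action": action}],
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    body = build_access_request("nancy@example.com", "customer-001", "203.0.113.10")
    print(json.dumps(body, indent=2))
```

Keeping the body construction in one small function makes it easy to see that the principalId and assetId configured on the protected system are the only identifiers the caller has to supply.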

From the Authentication tab, issue a new authentication token and store it safely for a moment; this is what the sidecar will use to communicate with SGNL.

For demonstration purposes, we’ve assigned a very lenient policy on the SGNL side that allows all access requests:

Configuring the Cyral Sidecar

You’ll need to configure the Cyral Sidecar with the Authentication token we just issued, as well as configuration to talk to the SGNL Access Service when it receives a request. You can read more about configuring the Cyral Sidecar with Rego Policies here and read more about calling out to SGNL with OPA on our blog or help documentation.

Within the OPA Policy you create, you’ll use the http.send built-in function to make a POST request to the SGNL Access Service, passing an ‘Authorization’ header with the Bearer token you previously saved from SGNL. A sample Rego policy that you might use in Cyral would look like:

# Do not block by default; block only when SGNL returns something other than "Allow".
default block := false

url := "https://access.sgnlapis.cloud/access/v1/evaluations"

# Token issued from the Authentication tab of the SGNL protected system.
token := "<integrationToken>"

# Calling user and client IP address, taken from the Cyral input.
principal := input.activityLog.identity.endUser
ipaddress := input.activityLogs.client.host

block {
  response := http.send({
    "method": "POST",
    "url": url,
    "tls_use_system_certs": true,
    "headers": {
      "Authorization": sprintf("Bearer %s", [token]),
      "Content-type": "application/json",
    },
    "body": {
      "principal": {
        "id": principal,
        "ipAddress": ipaddress
      },
      "queries": [
        {
          "assetId": "<the type of asset>",
          "action": "access"
        }
      ]
    },
    "force_cache": false,
    "force_json_decode": true
  })
  # Block unless the first decision is "Allow".
  response.body.decisions[0].decision != "Allow"
}

policyDecision = {
  "requestViolation": {
    "actions": {
      "alert": true,
      "block": block
    },
    "cause": "Blocking all requests",
    "severity": "low"
  }
}

Note from the support documentation that you will have to create the Policy Template and then apply it to the Cyral Control Plane. When creating the policy, you can easily identify the calling user via input.activityLog.identity.endUser and the IP address via input.activityLogs.client.host.

Ensure that within the Cyral Control Plane, you have also enabled Policy Enforcement to Block on Violations.
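If it helps to see the decision handling outside of Rego, the following Python sketch shows one way to interpret the response. The response shape (a decisions list whose entries carry a decision string) is taken from the sample policy’s response.body.decisions[0].decision. The function name is hypothetical, and choosing to block when no decision is present is an assumption of this sketch rather than documented SGNL or Cyral behavior.

```python
def should_block(response_body):
    """Return True unless the first decision in the response is "Allow"."""
    decisions = response_body.get("decisions") or []
    if not decisions:
        # Assumption of this sketch: a missing decision means no access.
        return True
    return decisions[0].get("decision") != "Allow"


if __name__ == "__main__":
    print(should_block({"decisions": [{"decision": "Allow"}]}))  # False
    print(should_block({"decisions": [{"decision": "Deny"}]}))   # True
    print(should_block({}))                                      # True
```

Whether an empty or malformed response should block or allow is a design choice worth making deliberately; for sensitive data, blocking is usually the more conservative option.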

Ready to Make Requests

At this point, we have the Cyral Sidecar sitting in front of our Snowflake Instance, and configured to call into the SGNL Protected System. As you might recall, we have a fairly permissive policy in place, allowing all access at the moment.

Accessing Snowsight as Nancy, our US/Canada-based Account Executive, we can see that she is currently able to view all Customer Records inside of the Snowflake Database.

Unsurprisingly, John’s access looks identical, given that all users have access to all information.

It’s about time we gave Nancy and John access to only the customers that they actually work with on a day-to-day basis, so we’re going to remove the All Access Policy and replace it with two new policies, the AU/NZ Customer Data Access Policy and the US/Canada Customer Data Access Policy:

The policies are fairly simple for illustrative purposes, but you’re probably starting to see how powerful human-readable policies composed from snippets can be:

These policies allow access based on the desk location and assignment of a user, as well as where the Customer is based.
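As a rough illustration of the logic these two policies express, here is a small Python sketch. The actual policies are built from snippets in the SGNL user interface, so everything below (the class and field names, the country codes, and the way regions are defined) is a hypothetical stand-in, assuming that a policy allows access when the user’s desk location, the user’s regional assignment, and the customer’s country all fall within the same region.

```python
from dataclasses import dataclass

# Hypothetical region definitions; in practice this context would come
# from Systems of Record such as the IdP and the CRM.
REGIONS = {
    "US/Canada": {"US", "CA"},
    "AU/NZ": {"AU", "NZ"},
}


@dataclass(frozen=True)
class User:
    email: str
    desk_country: str     # desk location from the IdP
    assigned_region: str  # regional assignment from the CRM


@dataclass(frozen=True)
class Customer:
    customer_id: str
    country: str


def region_policy_allows(user, customer, region):
    """Allow when the user's desk, assignment, and the customer all match `region`."""
    countries = REGIONS[region]
    return (
        user.desk_country in countries
        and user.assigned_region == region
        and customer.country in countries
    )


def is_allowed(user, customer):
    """Access is granted if any regional policy allows it."""
    return any(region_policy_allows(user, customer, region) for region in REGIONS)


if __name__ == "__main__":
    nancy = User("nancy@example.com", "US", "US/Canada")
    john = User("john@example.com", "AU", "AU/NZ")
    print(is_allowed(nancy, Customer("c-1", "CA")))  # True
    print(is_allowed(nancy, Customer("c-2", "AU")))  # False
    print(is_allowed(john, Customer("c-3", "NZ")))   # True
```

Because each regional policy is the same rule applied to a different region, adding a new region in this sketch is a one-line change to the region table, which is the same reuse that composable snippets aim for.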

Without any changes in Snowflake, or any requirement to log out and back in, we can immediately see the result if we have Nancy run her query again:

With the exact same query, Nancy is now only able to see customer data for those customers she is responsible for. Similarly for John:

But let’s say this data has special residency or sovereignty requirements and we only allow access to it within its country of origin. We can apply another policy based on context about the principal’s current location:

We’re adding an additional ‘Deny’ Policy to the Snowflake Protected System here. With this Policy, the SGNL Platform makes a determination of the current country a principal is physically located in, based on their IP Address. The logic behind this policy looks to determine the country of the customer, and the physical location of the user. If these are different, access to that specific customer record should be denied.
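The comparison at the heart of this policy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and customer records are hypothetical, the IP-to-country lookup is performed by the SGNL Platform and is represented here by an already-resolved country code, and the sketch assumes that a matching deny takes precedence over any allow.

```python
def location_deny_applies(principal_country, customer_country):
    """Deny when the principal is physically outside the customer's country."""
    return principal_country != customer_country


def visible_customers(allowed_customers, principal_country):
    """Filter customers already allowed by other policies.

    allowed_customers: mapping of customer Id -> customer country.
    principal_country: country resolved from the principal's IP address.
    Assumption of this sketch: a matching deny overrides an allow.
    """
    return sorted(
        customer_id
        for customer_id, country in allowed_customers.items()
        if not location_deny_applies(principal_country, country)
    )


if __name__ == "__main__":
    # Hypothetical records; both users are physically in Australia.
    john_allowed = {"acme-au": "AU", "kiwi-nz": "NZ"}
    nancy_allowed = {"globex-us": "US", "maple-ca": "CA"}
    print(visible_customers(john_allowed, "AU"))   # ['acme-au']
    print(visible_customers(nancy_allowed, "AU"))  # []
```

Note that the check is evaluated per customer record, which is why the same policy can narrow one user’s view to a subset of records while leaving another user with nothing visible.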

We can see that John, who is based in Melbourne, Australia, can now only see Australian Customer records, and no longer those in New Zealand:

As you’ve seen, Nancy is assigned to, and is only allowed access to, US and Canadian customers, but in this example she has traveled to Australia to meet with John. Based on the new policy we’ve assigned, the country she is physically located in (Australia) will not match any of the accounts that she actively works on. As a result, we expect that she won’t be able to see any data.

With this capability, we’ve now prevented sensitive PII from being able to be accessed outside of its allowed location. While this is a simple example, SGNL and Cyral customers are able to adhere to data sovereignty and residency requirements, ensuring that sensitive data cannot move across geographic boundaries.

Conclusion

In this short guide, we’ve walked you through how SGNL and Cyral can seamlessly and easily work together to continuously and contextually allow access to the appropriate data inside of Snowflake. We’ve really just scratched the surface of the capabilities available across these two platforms, from field-level masking, to data obfuscation, substitution, and more. This is all done dynamically, based on critical business context from organizational Systems of Record. What’s better is that the policies we’ve created today can be re-used across any of the other Protected Systems that an organization has, from Snowflake, to Salesforce, to Homegrown Apps and beyond. If you’d like to learn more please reach out or get in touch for a demo of our capabilities.