DML-ish Operations in the Visualforce Controller Constructor

DML is not allowed in a Visualforce page controller’s constructor. This isn’t news.

However, a colleague and I discovered that the behavior with various DML-equivalent (“DML-ish”) code is inconsistent and rather strange.

Here’s a minimal Visualforce page:

<apex:page controller="TestBatchFiringController">
</apex:page>

and its corresponding controller:

public class TestBatchFiringController {
    public TestBatchFiringController() {
        // These statements complete successfully.

        // Results in a valid AsyncApexJob id, but batch does not execute.
        // AsyncApexJob Id is not valid after method executes (but it is in test context!)
        System.debug('My job id is: ' + Database.executeBatch(new TestBatchFiringBatch(), 1));

        // Results in a valid CronTrigger id, but schedulable does not execute.
        // CronTrigger Id is not valid after method executes (but it is in test context!)
        System.debug('My schedulable job id is ' + System.schedule('Test Job', '20 30 8 10 2 ?', new TestSchedulableFiring()));

        // This would fail with a `LimitException` - DML is not allowed.
        // Account a = new Account(Name = 'Testy Test Test');
        // insert a;
        // System.debug('My Account\'s Id is ' + a.Id);

        // Setting a savepoint is DML-equivalent - this also throws a `LimitException`
        // System.Savepoint sp = Database.setSavepoint();
    }
}

Performing DML results in exactly the expected LimitException, as does setting a savepoint - a DML-equivalent operation. Neither is allowed in this context.

However, the behavior of invoking Schedulable or Batchable classes from a Visualforce controller is stranger. These are both DML-ish operations, involving serializing and persisting the job data. But while neither actually works, we also don’t get a LimitException.

Both Database.executeBatch() and System.schedule() execute successfully, and each returns an apparently valid Id value for an AsyncApexJob or CronTrigger respectively. These Ids can be output with System.debug() and are visible in the logs if you preview the Visualforce page. The corresponding records, however, don’t exist once the constructor completes (not even in a failed state), and the asynchronous jobs never execute. In the batch case, neither start() nor execute() is called. It’s as if the asynchronous job was never invoked, or as if the entire constructor were wrapped in a savepoint/rollback structure.

This behavior is unique to the constructor being run as the controller of the Visualforce page. In a test context or Anonymous Apex, with the exact same controller being instantiated directly, both asynchronous jobs complete successfully, and the AsyncApexJob and CronTrigger records can be inspected afterwards. Both batchable and schedulable jobs can be successfully invoked from an action method, including when the action method is fired by <apex:page action="{! myAction }"> upon page load.
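
Here’s a minimal sketch of that working alternative, reusing the TestBatchFiringBatch class from above (the fireBatch method name and the jobId property are mine, purely for illustration):

public class TestBatchFiringController {
    public Id jobId { get; private set; }

    // Bound to the page via <apex:page controller="TestBatchFiringController" action="{! fireBatch }">
    // or an <apex:commandButton>. Unlike in the constructor, the batch job genuinely starts here.
    public PageReference fireBatch() {
        jobId = Database.executeBatch(new TestBatchFiringBatch(), 1);
        System.debug('My job id is: ' + jobId);
        return null; // stay on the current page
    }
}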

Now, performing these kinds of operations in a controller constructor is a bad idea (and a security risk) for a wide variety of reasons. It’s not a missing capability. Rather, it seems to just be an interesting edge case of undefined behavior on the Salesforce platform, and something to watch out for in Visualforce-land.

Continuous Integration with Salesforce DX: Practices and Principles for All

Last week, I had the pleasure of giving a talk on continuous integration with Salesforce DX and CircleCI at PhillyForce ‘18. The talk built upon past work I’ve shared here, including:

Slide Screenshot

The full video will be on YouTube in the future. In the meantime, have a look at the GitHub repo, which includes all of the example CircleCI workflows and source code from the talk.

Integration Testing Off-Platform Code with Salesforce DX and `simple_salesforce`

I am a huge fan of the lightweight, easy-to-use simple_salesforce module for Python. I use simple_salesforce constantly, for everything from one-off data analysis scripts to sandbox setup automation to full-scale ETL solutions. As some of those solutions grow more complex or durable, I start to feel the need to build serious tests for them.

For software like this, of course, Python unit tests tell only a portion of the story. It’s critical to actually connect the tool to a Salesforce instance, kick it off, and make sure the results look right - an end-to-end integration test.

This kind of test isn’t repeatable in sandboxes or in a developer edition; “real” orgs get polluted, and it’s easy to introduce silent dependencies. That’s why Salesforce DX scratch orgs are a great solution for testing off-platform code just as much as Apex: we build a scratch org with a predictable shape and feature set, perhaps seed some standardized data, connect our integration tool, run our tests, and toss the scratch org.
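
In a CI script, that lifecycle might look roughly like this (the org alias, definition file, and seed-data plan are illustrative, not taken from the repository):

# Create a scratch org with a known shape, seed data, run the integration, and clean up.
sfdx force:org:create -v DevHub -s -f config/project-scratch-def.json -a integration-org
sfdx force:source:push -u integration-org
sfdx force:data:tree:import -u integration-org -p data/seed-plan.json  # optional standardized data
python my_integration_tool.py  # the tool under test, followed by assertions on the results
sfdx force:org:delete -u integration-org -p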

With off-platform code there are a couple of extra steps needed to get access to the scratch org, and we have two good options depending on how the integration typically connects to Salesforce. A demonstration of both techniques is available on my GitHub in a CircleCI context.

Access Token

The easiest (and best) route is for the integrated tool to use an access token and instance URL to connect to Salesforce (simple_salesforce supports this). Once we’ve created our scratch org, a couple of SFDX and Python one-liners can extract these values and pass them to the tool under test:

export ACCESS_TOKEN=$(sfdx force:user:display --json | python -c "import json; import sys; print(json.load(sys.stdin)['result']['accessToken'])")
export INSTANCE_URL=$(sfdx force:user:display --json | python -c "import json; import sys; print(json.load(sys.stdin)['result']['instanceUrl'])")
python example-simple-salesforce.py -a "$ACCESS_TOKEN" -i "$INSTANCE_URL" -s

Some tools may use the environment variables, while others will need the access values on the command line. However the tool acquires these values, it can establish a connection to the scratch org with simple_salesforce:

connection = simple_salesforce.Salesforce(instance_url=instance_url, session_id=access_token, sandbox=True)

Username and Password

Scratch org access via username and password is a bit trickier. Salesforce DX scratch orgs don’t start with passwords, although we can easily create them. More importantly, though, it’s not possible (and likely never will be possible - Success Community login required) to set or obtain a scratch org user’s security token from the command line.

Fortunately, we can use SFDX to access the Metadata API. And with the Metadata API, we can deploy a small Security.settings entity that establishes all IP addresses as a trusted network, allowing us to bypass verification codes and the security token requirement.
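
The settings entity in question looks something like this (a sketch; the actual deployment package is on GitHub):

<?xml version="1.0" encoding="UTF-8"?>
<SecuritySettings xmlns="http://soap.sforce.com/2006/04/metadata">
    <!-- Trust every IP address so CI logins skip verification codes and security tokens -->
    <networkAccess>
        <ipRanges>
            <end>255.255.255.255</end>
            <start>0.0.0.0</start>
        </ipRanges>
    </networkAccess>
</SecuritySettings>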

(Is this a great idea, security-wise? Not especially. But we’re only implementing it on an ephemeral scratch org that will be recycled in a matter of minutes. Don’t do this to your production, sandbox, developer edition, or long-lived scratch org.)

sfdx force:mdapi:deploy -d md-src -w 5 # md-src is where our `Security.settings` lives.

sfdx force:user:password:generate > /dev/null 
export PASSWORD=$(sfdx force:user:display --json | python -c "import json; import sys; print(json.load(sys.stdin)['result']['password'])")
export SF_USERNAME=$(sfdx force:user:display --json | python -c "import json; import sys; print(json.load(sys.stdin)['result']['username'])")
python example-simple-salesforce.py -p "$PASSWORD" -u "$SF_USERNAME" -t "" -s

The Metadata API deployment package is available on GitHub. We deploy the package first (with a -w wait of up to 5 minutes for the deployment to complete), then ask SFDX to generate a new password for the user. We redirect stdout to /dev/null because sfdx force:user:password:generate prints a human-readable message containing the password to stdout, and we don’t need that in our CI logs (although, again, we are throwing away this scratch org).

Finally, we once again apply Python one-liners to parse the JSON output of sfdx force:user:display and extract the username and password we need. We pass, in whatever format is most appropriate, an empty string to our tool where a security token would normally go, and that tool then authenticates to Salesforce:

connection = simple_salesforce.Salesforce(username=username,
                                          password=password,
                                          security_token=token,
                                          sandbox=sandbox)

The example tool in GitHub uses argparse to accept these values on the command line.
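
As a rough sketch, argument handling along those lines might look like this (the exact flags and structure of the repository’s tool may differ):

import argparse

import simple_salesforce


def get_connection():
    # Mirror the command-line flags used in the invocations above.
    parser = argparse.ArgumentParser(description='Example simple_salesforce integration test tool')
    parser.add_argument('-a', '--access-token', help='Access token for the scratch org')
    parser.add_argument('-i', '--instance-url', help='Instance URL for the scratch org')
    parser.add_argument('-u', '--username', help='Username (when not using an access token)')
    parser.add_argument('-p', '--password', help='Password (when not using an access token)')
    parser.add_argument('-t', '--token', default='', help='Security token (empty string for scratch orgs)')
    parser.add_argument('-s', '--sandbox', action='store_true', help='Treat the org as a sandbox')
    args = parser.parse_args()

    if args.access_token:
        return simple_salesforce.Salesforce(instance_url=args.instance_url,
                                            session_id=args.access_token,
                                            sandbox=args.sandbox)

    return simple_salesforce.Salesforce(username=args.username,
                                        password=args.password,
                                        security_token=args.token,
                                        sandbox=args.sandbox)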

Performing Tests

Once we can log in to the scratch org, we’ll run one or more scenarios with our integrated tool. Wrapper scripts can use the same approach to gain access and check the results, making appropriate assertions along the way:

# ... having run a script that creates Accounts

connection = simple_salesforce.Salesforce(instance_url=instance_url, session_id=access_token, sandbox=True)
results = connection.query('SELECT Id, Name, AnnualRevenue FROM Account')

assert len(results.get('records')) == 12
assert 'Test Account 1' in [r['Name'] for r in results.get('records')]
# ... and so on. We have full access to review and evaluate the state of the scratch org post-testing.

Salesforce Lifecycle and Tooling: Testing on Multiple Org Types with Salesforce DX and CircleCI Workflows

Let’s suppose you’re running a successful continuous integration program, using Salesforce DX and CircleCI or another continuous integration provider. Your automated testing is in place and working well. But the code you’re building has to work in a number of different environments. You might be an ISV, an open-source project, or an organization with multiple Salesforce instances and a shared codebase, and you need your tests to pass in a standard Enterprise edition as well as in a Person Accounts instance, a Multi-Currency org, a Professional edition, or any number of other combinations of Salesforce editions and features.

Salesforce DX and CircleCI make it very easy to automate running tests against these different Salesforce environments, and to do so in efficient, parallel, isolated testing streams. The process consists of three steps:

  1. Define organization types and features in Salesforce DX Scratch Org Definition Files in JSON format.
    • Optionally, define additional Metadata API or package installation steps to complete preparation of a specific org for testing.
  2. Define jobs in the CircleCI configuration file, either by scripting each environment’s unique setup individually or by referencing a common build sequence in the context of each org type.
  3. Define a workflow in the CircleCI configuration file that runs these jobs in parallel.

This article assumes that you’ve followed Salesforce Lifecycle and Tooling: CircleCI and Salesforce DX and are using a fairly similar config.yml. However, the principles are transferable to other continuous integration environments and build sequences.

Defining Organization Types and Features

Salesforce DX scratch org definitions don’t need to be complex, and are defined in JSON. This is a simple example that adds a feature (Sites) to the default configuration:

{
    "orgName": "David Reed",
    "edition": "Enterprise",
    "features": ["Sites"],
    "orgPreferences" : {
        "enabled": ["S1DesktopEnabled"]
    }
}

The feature set that is accessible through the org definition file is still somewhat in flux. New features are being added, and some important facets are still not available. The best references for what is available are the Salesforce DX Developer Guide and the Salesforce DX group in the Trailblazer Community.

Org definition files live in the config directory in a DX project. When you create a scratch org, you provide a definition file with the -f switch; you’re free to add multiple definition files to your repository.
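
For example, to spin up a scratch org from the developer.json definition referenced later in this post (using the same DevHub and scratch aliases as the build below):

sfdx force:org:create -v DevHub -s -f config/developer.json -a scratch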

Note that we’re not discussing the Org Shape feature here, which is currently in pilot. Once Org Shape becomes publicly available, more capabilities for defining and creating environment types will follow.

Define Jobs in CircleCI

Each organization definition we want to test against is represented as a job entry in the CircleCI config.yml.

version: 2
jobs:
  build-enterprise:
    ...
  build-developer:
    ...

We can define an arbitrary number of these jobs.

If we define jobs by copying and pasting the core SFDX build job, our config.yml can become unwieldy and difficult to maintain. If there’s a lot of setup work that significantly differs between the org definitions, it might be necessary nonetheless.

However, if the job definitions vary by little more than the name of the scratch org definition file, we can take advantage of YAML’s aliasing feature to template our core build instructions into each job, while using environment variables to define the differences between them.

Here’s what it looks like. (The complete config.yml file is available on GitHub).

job-definition: &jobdef
    docker:
        - image: circleci/node:latest
    steps:
        ...

&jobdef defines an alias - a name we can use later to pull in the material that follows, which we’ve factored out of the core config.yml developed previously. To that core build sequence, we make just one change, in the “Create Scratch Org” step:

    - run: 
        name: Create Scratch Org
        command: |
            node_modules/sfdx-cli/bin/run force:auth:jwt:grant --clientid $CONSUMERKEY --jwtkeyfile assets/server.key --username $USERNAME --setdefaultdevhubusername -a DevHub
            echo "Creating scratch org with definition $SCRATCH_DEF"
            node_modules/sfdx-cli/bin/run force:org:create -v DevHub -s -f "config/$SCRATCH_DEF" -a scratch

Note that we’re using a new environment variable, $SCRATCH_DEF, to store the name of the definition file we want to use. We’ll take advantage of that when we template this core mechanic into the individual jobs that define builds for each type of org.

Below this alias definition, at the top level of config.yml, we’ll start our jobs list:

version: 2
jobs:
  build-enterprise:
     <<: *jobdef
     environment:
        SCRATCH_DEF: project-scratch-def.json
  build-developer: 
     <<: *jobdef
     environment:
        SCRATCH_DEF: developer.json

Here, we define two jobs, one per scratch org. Each one includes the entire core build sequence &jobdef, including all of the build steps we’ve defined. Within each job, we assign a value to the environment variable $SCRATCH_DEF, which the build will use to create its unique scratch org.

Each of these jobs will run in a separate, isolated container, and each will use its own scratch org. We’ll get independent test results for each org definition, ensuring that our code’s behavior is correct in each org separately from the others.

This form can be extended even if your different org definitions require more configuration than is possible through the definition file. For example, each org might require installation of a different managed package with (for example) sfdx force:package:install -i $PACKAGE_ID. Or you might need to perform a different Metadata API deployment with sfdx force:mdapi:deploy -d "$MD_API_SRC_DIR" -w 10. Provided the build processes are structurally similar, templating and environment variables can help express them concisely and make the build easy to maintain.
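
As a sketch of how that might look, extending the jobs above (the PACKAGE_ID value and the install step are illustrative, not taken from the repository):

  build-enterprise:
     <<: *jobdef
     environment:
        SCRATCH_DEF: project-scratch-def.json
        PACKAGE_ID: 04t000000000000AAA   # hypothetical package version Id

with a corresponding templated step inside &jobdef:

    - run:
        name: Install Managed Package
        command: node_modules/sfdx-cli/bin/run force:package:install -i "$PACKAGE_ID" -u scratch -w 10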

There’s always the option, though, of copying and modifying our core build sequence into any individual job or set of jobs, making as many modifications as necessary. CircleCI will run them all the same, whichever route we take.

Complete the Process with CircleCI Workflow

The final step is to create a workflow entry in config.yml. The workflow ties together the different build jobs and expresses any dependencies between them. Lacking dependencies, the jobs will run in parallel, using as many containers as you have available.

workflows:
  version: 2
  test_and_static:
    jobs:
      - build-enterprise
      - build-developer
      - static-analysis

Here, we define a three-job workflow - one each for the two org definitions against which we want to test, and a third job for our PMD static analysis (see Integrating Static Analysis with PMD in the Salesforce Development Lifecycle). When we push to Git, CircleCI will initiate these three jobs in parallel. Each will succeed or fail individually, and you’ll get status indicators in GitHub for each job.

GitHub Results

The workflow as a whole shows success or failure aggregated from its component jobs, and you can rerun the entire workflow or individual failed jobs as needed.

So there we have it: our code, tested instantly and efficiently against as many different Salesforce orgs as we need - subject, of course, to your organization’s scratch org limits!

Everyday Salesforce Patterns: Child-Parent SOQL on Task and Event

Performing child-parent SOQL is more complex than usual when the Task and Event objects are involved. That’s because these objects include polymorphic lookup fields, WhoId and WhatId, which can point to any one of a number of different objects.

While a feature called SOQL polymorphism, currently in Developer Preview, would offer a way to obtain details about Task and Event parents in pure SOQL, unless and until it becomes generally available, Apex is required to query anything beyond a tiny subset of Name-related fields on these objects’ parents. Below is an example of this pattern as it might be applied in a trigger. The core of the pattern is the following steps:

  1. Iterating over the Task or Event records and accumulating the WhatId or WhoId values on a per-object basis;
  2. performing a single SOQL query per parent object type;
  3. creating a Map<Id, sObject> to allow indexing the WhoId/WhatId into the query results;
  4. and finally iterating a second time over the Task or Event records to perform work with the parent information available through the Map.

This skeleton implementation sets a checkbox field called High_Priority__c on the Task when its WhatId refers to an open Opportunity or to an Account whose AnnualRevenue is greater than one million dollars. For an Account, we also set a field to indicate that a high-priority task is present on the parent. (This requirement, of course, is contrived). Note that the pattern works the same way whether we’re looking at WhoId or WhatId, and whether or not we’re in a trigger context.

trigger TaskTrigger on Task (before insert) {
    // In production, we would use a trigger framework; this is a simple example.
    
    // First, iterate over the set of Tasks and look at their WhatIds.
    // Relationship fields like What.Type aren't populated on Trigger.new, so we derive each
    // parent object type from the Id itself (we could also check the Id's three-character key prefix).
    // We'll accumulate the WhatIds in Sets to query (1) for Account and (2) for Opportunity.
    
    Set<Id> accountIds, oppIds;
    
    accountIds = new Set<Id>();
    oppIds = new Set<Id>();
    
    for (Task t : Trigger.new) {
        if (t.WhatId != null) {
            // Derive the parent object type from the Id itself.
            Schema.SObjectType whatType = t.WhatId.getSObjectType();

            if (whatType == Account.SObjectType) {
                accountIds.add(t.WhatId);
            } else if (whatType == Opportunity.SObjectType) {
                oppIds.add(t.WhatId);
            }
        }
    }
    
    // We will query into Maps so that we can easily index into the parent with our WhatIds
    Map<Id, Account> acts;
    Map<Id, Opportunity> opps;
    Map<Id, Account> actsToUpdate = new Map<Id, Account>();
    
    // Now we can query for the parent objects.
    // Here, the parent object logic is entirely contained in the query;
    // it could also be implemented in the loop below.
    acts = new Map<Id, Account>([SELECT Id FROM Account WHERE Id IN :accountIds AND AnnualRevenue > 1000000]);
    opps = new Map<Id, Opportunity>([SELECT Id FROM Opportunity WHERE Id IN :oppIds AND IsClosed = false]);
    
    // We re-iterate over the Tasks in the trigger set and alter their fields based on the information
    // queried from their parents. Note that this is a before insert trigger so no DML is required.
    
    for (Task t : Trigger.new) {
        if (acts.containsKey(t.WhatId) || opps.containsKey(t.WhatId)) {
            // With more complex requirements, we could source data from the parent object
            // rather than simply making a decision based upon the logic in the parent queries.
            
            t.High_Priority__c = true;

            // We also want to update the parent object if it's an Account.
            if (acts.containsKey(t.WhatId)) {
                Account a = acts.get(t.WhatId);

                a.Has_High_Priority_Task__c = true;
                actsToUpdate.put(a.Id, a);
            }
        }
    }

    update actsToUpdate.values();
}

This example of the pattern assumes we’re starting from the Task and making some decision based on information in the parent. In other situations, we might query first for a set of Tasks in which we’re interested (perhaps applying a filter on WhatId or WhoId, or What.Type or Who.Type), follow a similar pattern to source parent information, and then update the parent records - or a different object entirely. The skeleton of the solution, however, will remain the same.