Amaxa: A Multi-Object Data Loader for Salesforce

I’ve just released Amaxa, an open-source project I’ve been working on for several months. Amaxa is a multi-object ETL tool/data loader for Salesforce. It’s designed to extract and load whole networks of records, like a selected set of Accounts with all of their Contacts, Opportunities, Contact Roles, and Campaigns, in a single repeatable operation while preserving the relationships between those records.

Core use cases for Amaxa include sandbox seeding, data migration, and retrieving connected data sets. The current release of Amaxa is v0.9.2, in beta. Feedback and bug reports, via GitLab Issues, are welcome. There’s a lot more to come.

Amaxa is built in Python with Salesforce DX, simple_salesforce, and salesforce_bulk. Instructions and examples are included in the GitLab repository.

Locating Salesforce Compound and Component Fields in Apex and Python

One of the odder corners of the Salesforce data model is the compound fields. Coming in three main varieties (Name fields, Address fields, and Geolocation fields), these fields are accessible both under their own API names and in the forms of their component fields, which have their own API names. The compound field itself is always read-only, but the components may be writeable.

For example, on the Contact object is a compound address field OtherAddress. (There are a total of four standard Address fields spread across the Contact and Account objects, with a handful of others across Lead, Order, and so on). The components of OtherAddress are

  • OtherStreet
  • OtherCity
  • OtherState
  • OtherPostalCode
  • OtherCountry
  • OtherStateCode
  • OtherCountryCode
  • OtherLatitude
  • OtherLongitude
  • OtherGeocodeAccuracy

Similarly, Contact has a compound Name field, as do Person Accounts, with components like FirstName and LastName.

So, if we’re working in dynamic Apex or building an API client, how do we acquire and understand the relationships between these compound and component fields?

API

In the REST API, the Describe resource for the sObject returns metadata for the object’s fields as well. This makes it easy to acquire all the data we need in one go.

GET /services/data/v43.0/sobjects/Contact/describe

yields, on a lightly customized Developer Edition, about 250KB of JSON. Included is a list under the key "fields", which contains the data we need (abbreviated here to omit irrelevant data points):

"fields": [
    {
        "compoundFieldName": null,
        "label": "Contact Id",
        "name": "Id"
    },
    {
        "compoundFieldName": "null",
        "label": "Name",
        "name": "Name"
    },
    {
        "compoundFieldName": "Name",
        "label": "First Name",
        "name": "FirstName"
    }
]

Each field includes its API name ("name"), its label, other metadata, and "compoundFieldName". The value of this last key is either null, meaning that the field we’re looking at is not a component field, or the API name of the parent compound field. There’s no marker indicating that a field is compound.

This structure can be processed easily enough in Python or other API client languages to yield compound/component mappings. Given some JSON response (parsed with json.loads()), we can do

def get_compound_fields(response):
    return {
        field["compoundFieldName"] for field in response["fields"] if field["compoundFieldName"] is not None
    }

Likewise, we can get the components of any given field:

def get_component_fields(response, api_name):
    return [field["name"] for field in response["fields"] if field["compoundFieldName"] == api_name]

Both operations can be expressed in various ways, including uses of map() and filter(), or can be implemented at a higher level if the describe response is processed into a structure, such as a dict keyed on field API name.
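As a sketch of that higher-level approach (assuming `response` is a parsed describe payload shaped like the excerpt above), a single pass can build a map from each compound field to its components:

```python
from collections import defaultdict

def build_compound_map(response):
    """Map each compound field's API name to the list of its component fields."""
    compound_map = defaultdict(list)
    for field in response["fields"]:
        parent = field["compoundFieldName"]
        if parent is not None:
            compound_map[parent].append(field["name"])
    return dict(compound_map)
```

Both earlier queries then become dictionary operations: the keys are the compound fields, and each value is that field’s component list.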

Apex

The situation in Apex is rather different because of the way Describe information is returned to us. Rather than a single, large blob of information covering an sObject and all of its fields, we get back individual describes for an sObject (Schema.DescribeSobjectResult) and each field (Schema.DescribeFieldResult). (We can, of course, call out to the REST Describe API in Apex, but this requires additional work and an API-enabled Session Id).

Moreover, Schema.DescribeFieldResult does not include the critical compoundFieldName property.

… or rather, it isn’t documented to include it. In point of fact, it does contain the same data returned for a field in the API Describe call, as we can discover by inspecting the JSON result of serializing a Schema.DescribeFieldResult record.

Unlike some JSON-enabled Apex magic, we can get to this hidden value without actually using serialization. Even though they’re undocumented, these references compile and execute as expected:

Contact.OtherStreet.getDescribe().compoundFieldName

and

Contact.OtherStreet.getDescribe().getCompoundFieldName()

This makes it possible to construct Apex utilities, like those we built in Python, to source compound fields and their components. In Apex, we’ll necessarily be a bit more verbose than in Python, and performance is a concern in broad-based searches: both finding the compound fields on one sObject and locating the components of one compound field take between 0.07 and 0.1 seconds in unscientific testing. Your performance may vary.

public class CompoundFieldUtil {
    public static List<SObjectField> getCompoundFields(SObjectType objectType) {
        Map<String, SObjectField> fieldMap = objectType.getDescribe().fields.getMap();
        List<SObjectField> compoundFields = new List<SObjectField>();
        Set<String> compoundFieldNames = new Set<String>();

        for (String s : fieldMap.keySet()) {
            Schema.DescribeFieldResult dfr = fieldMap.get(s).getDescribe();

            if (dfr.compoundFieldName != null && !compoundFieldNames.contains(dfr.compoundFieldName)) {
                compoundFields.add(fieldMap.get(dfr.compoundFieldName));
                compoundFieldNames.add(dfr.compoundFieldName);
            }
        }

        return compoundFields;
    }

    public static List<SObjectField> getComponentFields(SObjectType objectType, SObjectField field) {
        Map<String, SObjectField> fieldMap = objectType.getDescribe().fields.getMap();
        List<SObjectField> components = new List<SObjectField>();
        String thisFieldName = field.getDescribe().getName();
                
        for (String s : fieldMap.keySet()) {
            if (fieldMap.get(s).getDescribe().compoundFieldName == thisFieldName) {
                components.add(fieldMap.get(s));
            }
        }
        
        return components;
    }
}

Then,

System.debug(CompoundFieldUtil.getComponentFields(Contact.sObjectType, Contact.OtherAddress));

yields

14:15:14:523 USER_DEBUG [1] DEBUG (OtherStreet, OtherCity, OtherState, OtherPostalCode, OtherCountry, OtherStateCode, OtherCountryCode, OtherLatitude, OtherLongitude, OtherGeocodeAccuracy)

and

System.debug(CompoundFieldUtil.getCompoundFields(Contact.sObjectType));

yields

22:15:30:089 USER_DEBUG [1] DEBUG (Name, OtherAddress, MailingAddress)

Simple modifications could support the use of API names rather than SObjectField tokens, building maps between compound fields and their components, and similar workflows.


This post developed out of a Salesforce Stack Exchange answer I wrote, along with work on a soon-to-be-released data loader project.

Building Unit Test Learning Resources on Salesforce Stack Exchange

Unit testing and code coverage are perennial topics on Salesforce Stack Exchange. It’s not hard to see why: unit testing can be profoundly unintuitive for new developers (on any platform), and involves mastering techniques and concepts that are quite different from those needed to be productive in building applications.

Growing out of discussion about community challenges in managing lots of duplicative basic questions, I started a project to write up canonical answers to common questions on unit tests, and bring together the most important resources.

Here are the first fruits of that project, all of which were posted (by me and other writers) to Salesforce Stack Exchange as community wikis. These articles can be expanded and evolved by the Salesforce Stack Exchange community to give learners a basic conceptual framework and connect them to the best resources available on how to test effectively on the Salesforce platform.

Community Wiki Questions on Unit Testing

Running Reports as Selected Users with JWT OAuth and the Reports and Dashboards API

Salesforce reporting introduces some fascinating complexities to data visibility and exposure, particularly for organizations using Private Organization-Wide Defaults.

The key complicating factor is this: when a Salesforce report is run, it’s run in the context of some user or another, and the records that are shown on the report are the ones that are visible to that user. This means that report distribution solutions have to be very careful to only show each user a report run in their own context - not someone else’s.

Suppose your organization builds a critical report that many users will need to review. It’s built to show “My Opportunities”, so each user will see only their own Opportunities, and the Opportunity Organization-Wide Default is Private. You add a criterion to the report to only show Opportunities that have your internal “Needs Attention” Checkbox set. Now: how do you make sure your users are regularly updated when they have Opportunities that require their review?

A naive solution would create one subscription to this report, say for Frank Q. Exec, and add all of the users who need to receive it as recipients:

Lightning Subscription

But this runs afoul of the principle mentioned above: the report’s context user is Frank, and the recipients of the report will see data as if they were Frank. From Salesforce:

IMPORTANT Recipients see emailed report data as the person running the report. Consider that they may see more or less data than they normally see in Salesforce.

This is unlikely to be an acceptable outcome.

Further, we can’t simply have Frank create many subscriptions to the same report, adding one user as both the recipient and the running user to each: Frank only gets five total report subscriptions, and he can only have one subscription to each report.

Of course, users can schedule reports themselves, in their own context, and they can run them manually, and we can build dynamic dashboards (which come with their own limits). But what if we really need to create these subscriptions for our users automatically, or allow our admins to manage them for thousands of users at a time? What if, in fact, we want to offer the users a bespoke user interface to let them select subscriptions to standard corporate reports, or run reports in their contexts to feed into an external reporting or business intelligence solution?

This is a question I’ve struggled with before, and I was excited to see Martin Borthiry propose the issue on Salesforce Stack Exchange. Here, I’d like to expand on the solution I sketched out in response to Martin’s question.

Background

There are two report subscription functionalities on Salesforce, and they work rather differently. Report subscriptions are summarized in the Salesforce documentation under Schedule and Subscribe to Reports.

On Classic, one can “Subscribe” to a report, and one can “Schedule Future Runs”. The nomenclature here is confusing: a Classic “Subscribe” asks Salesforce to notify us if the report’s results meet certain thresholds, but it’s not for regularly receiving copies of the report. We’re not going to look at this feature. “Schedule Future Runs” is equivalent to a report subscription in Lightning and is the feature corresponding to the business problem discussed above.

Classic Subscription

On Lightning, we simply have an option to Subscribe, as we saw above. There’s no Lightning equivalent to the Classic “Subscribe” feature.

So what happens when we subscribe to a report?

Both the Classic “Schedule Future Runs” and the Lightning Subscribe functionality are represented under the hood as CronTrigger and CronJobDetail records with the CronJobDetail.JobType field set to 'A', for Analytics Notification. You can find them from the Developer Console or Workbench via queries like

SELECT CronExpression, OwnerId, CronJobDetail.Name FROM CronTrigger WHERE CronJobDetail.JobType = 'A'

Unfortunately, knowing this doesn’t help us very much. Neither CronTrigger nor CronJobDetail can be created directly in Apex or via the API, and the objects provide very little detail about existing report subscriptions. The Report Id, for example, is notable by its absence, and the Name field is just a UUID.

A more promising avenue for our use case is the Reports and Dashboards API, because it offers an endpoint to create an Analytics Notification.

POST /services/data/vXX.0/analytics/notifications

with a JSON body like this

{
  "active" : true,
  "createdDate" : "",
  "deactivateOnTrigger" : false,
  "id" : "",
  "lastModifiedDate" : "",
  "name" : "New Notification",
  "recordId" : "00OXXXXXXXXXXXXXXX",
  "schedule" : {
    "details" : {
      "time" : 3
    },
    "frequency" : "daily"
  },
  "source" : "lightningReportSubscribe",
  "thresholds" : [ {
    "actions" : [ {
      "configuration" : {
        "recipients" : [ ]
      },
      "type" : "sendEmail"
    } ],
    "conditions" : null,
    "type" : "always"
  } ]
}

The feature set shown here in JSON is at parity with the user interface, and has the same limitations. Adding a recipient for the subscription over the API, for example, suffers from the same visibility flaws as doing so in the UI. And the API doesn’t let us do what we truly want to - create report subscriptions for other users that run as those other users - because we cannot set the owner of the subscription programmatically.

… or can we?

While the Reports and Dashboards API doesn’t support setting the context user for a subscription, it always takes action as the user we authenticate as. And that we can control.

While an admin can Login As a user to create a one-off subscription, we’re more interested here in industrial-strength solutions that can support thousands of users. So we’re going to build a script to create subscriptions by talking to the Reports and Dashboards API, using the JSON Web Token (JWT) OAuth authentication mechanism. Why? Because the JWT flow is our only route to seamlessly authenticating as any (admin-approved) user, with no manual intervention or setup required on a per-user basis.

Setup: Connected Apps and Certificates

Setting up the JWT flow involves building a Connected App in Salesforce, under which our scripts will authenticate. JWT is secured using a certificate and associated public key/private key pair - Salesforce holds the public key, our script holds the private key.
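For the curious, the assertion our tooling signs on our behalf is just a small set of claims. A minimal sketch in Python (purely illustrative, since SFDX handles the signing and token exchange for us below; the claim names come from the OAuth 2.0 JWT bearer flow):

```python
import time

def build_jwt_claims(consumer_key, username, audience='https://login.salesforce.com'):
    """Claims for a Salesforce JWT bearer assertion: the Connected App's
    Consumer Key as issuer, the target username as subject, the login URL
    as audience, and a short expiration window."""
    return {
        'iss': consumer_key,
        'sub': username,
        'aud': audience,
        'exp': int(time.time()) + 180,
    }
```

The `sub` claim is what lets us act as any pre-approved user: we simply change the username, sign with the same private key, and exchange the assertion for an access token.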

This is the same mechanism used for authentication in many Continuous Integration solutions. I’m not going to rehash all of the details here, because they’re well-covered elsewhere. You can follow Salesforce’s steps in using SFDX for continuous integration, or read through my own article about setting up CircleCI with Salesforce DX.

When you’re finished building the Connected App, add the Profiles of each of the users who are to be subscribed to reports to the Connected App as a pre-approved Profile, or assign all of those users a Permission Set and assign that Permission Set as pre-approved on the Connected App. This ensures that we can authenticate to the API as those users without any intervention.

Building the Scripts

We’re going to stick to sketching out a solution here that can be adapted to many different business problems, as we discussed earlier. For simplicity, we’ll use Salesforce DX to handle the JWT authentication, even though we’re not using SFDX for development here. Because it’s my preferred scripting workflow, I’ll be using Python with simple_salesforce, but you could just as easily achieve this in Ruby, Java, JavaScript, or even just bash and curl.

The main job of our script is to login as a user and create a report subscription for them. We might build this towards a specific business process by adding scaffolding to, for example, query a custom object out of Salesforce to define which reports should be subscribed automatically for which users, but we’ll leave that elaboration to a later date. Once we’ve got that core functionality achieved, we can wrap it in the logic we need for specific applications.

Let’s put the private key from our JWT setup in a file called server.key. Put the username of the user we want to subscribe (who must be pre-authorized to the Connected App) in the environment variable $USERNAME and the Connected App’s Consumer Key in $CONSUMERKEY.

Then we can get an Access Token to make an API call into Salesforce, letting SFDX do the heavy lifting:

sfdx force:auth:jwt:grant --clientid $CONSUMERKEY --jwtkeyfile server.key --username $USERNAME -a reports-test
export INSTANCE_URL=$(sfdx force:org:display --json -u reports-test | python -c "import json; import sys; print(json.load(sys.stdin)['result']['instanceUrl'])")
export ACCESS_TOKEN=$(sfdx force:org:display --json -u reports-test | python -c "import json; import sys; print(json.load(sys.stdin)['result']['accessToken'])")

(If you have jq installed, you can simplify these one-liners).

Now we’ve established an authenticated session as $USERNAME, even though we do not have that user’s credentials or any setup for that user besides preauthorizing their profile on the Connected App, and we have the values we need (the Access Token and Instance URL) stored in our environment.

Now we’ll switch over to Python. A quick script grabs those environment variables and uses simple_salesforce to make an API call to generate the report subscription.

import simple_salesforce
import os
import sys

outbound_json = """
{
  "active" : true,
  "createdDate" : "",
  "deactivateOnTrigger" : false,
  "id" : "",
  "lastModifiedDate" : "",
  "name" : "New Notification",
  "recordId" : "%s",
  "schedule" : {
    "details" : {
      "time" : 3
    },
    "frequency" : "daily"
  },
  "source" : "lightningReportSubscribe",
  "thresholds" : [ {
    "actions" : [ {
      "configuration" : {
        "recipients" : [ ]
      },
      "type" : "sendEmail"
    } ],
    "conditions" : null,
    "type" : "always"
  } ]
}"""

# Use an Access Token and Report Id to add a Lightning report subscription for this user
# such that the report will run as that user.

access_token = os.environ['ACCESS_TOKEN']
instance_url = os.environ['INSTANCE_URL']

report_id = sys.argv[1]

sf = simple_salesforce.Salesforce(session_id=access_token, instance_url=instance_url)

sf.restful(
    'analytics/notifications',
    None,
    method='POST',
    data=outbound_json % report_id
)
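One caveat on the %-interpolation: it works here because the template contains no other % characters. A sturdier variant (a sketch using only the standard library) parses the template and sets the Id as data rather than text:

```python
import json

def build_subscription_payload(template, report_id):
    """Inject the report Id into the notification template as parsed JSON,
    avoiding any string-escaping or interpolation pitfalls."""
    payload = json.loads(template)
    payload["recordId"] = report_id
    return json.dumps(payload)
```

The API call then becomes sf.restful('analytics/notifications', None, method='POST', data=build_subscription_payload(outbound_json, report_id)), with the template’s "%s" replaced by an empty placeholder.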

Execute the script

python add-subscription.py $REPORTID

where $REPORTID is the Salesforce Id of the report to which you wish to subscribe the user. If we then log in as that user in the UI, we’ll find a shiny new Lightning report subscription established for them.

Lightning Final Subscription

Note that it’s set for daily at 0300, as specified in the example JSON.

Next Steps

We’ve got a proof-of-concept in place showing that we can in fact schedule reports for users, run as those users. In an article to follow soon, we’ll look at operationalizing this approach and building out business processes atop it.

Handling URLs in Lightning Data Tables

The <lightning:datatable> component has built-in support for displaying links in table columns. The syntax looks something like this:

{
    label: 'Case Number', 
    fieldName: 'My_URL_Field__c',
    type: 'url', 
    typeAttributes: { 
        label: {
            fieldName: 'CaseNumber'
        } 
    },
    sortable: true 
}

typeAttributes.label.fieldName identifies a field on each row to utilize as the title of the link, while fieldName at the top level specifies the URL field itself.

In many cases, though, what we have in our sObject data isn’t a fully-qualified URL: it’s a Salesforce Id, a lookup to this record or to some other record, and we’d really like to display it sensibly as a link with an appropriate title. Unfortunately, <lightning:datatable> doesn’t have an Id column type, and the url type is not clever enough to realize it’s been handed a record Id and handle it.

Instead, we need to generate the URL ourselves and add it as a property of the line items in the source data. (This is a bewildering shift for seasoned Apex programmers: we can just add fields to our sObjects?!) In the callback from the Apex server method querying our sObjects, we generate one or more synthetic properties:

cases.forEach(function(item) {
    item['URL'] = '/lightning/r/Case/' + item['Id'] + '/view';
});

Our column entry will end up looking like this:

{
    label: 'Case Number', 
    fieldName: 'URL',
    type: 'url', 
    typeAttributes: { 
        label: {
            fieldName: 'CaseNumber'
        },
        target: '_self'
    },
    sortable: true 
}

Then, the result’s just what you might think:

Lightning Data Table

The Case Number column is hyperlinked to open the related Case record.

Note that we’re using the Lightning form of the record URL (/lightning/r/<sObject>/<id>/view), and we’ve added the target: '_self' attribute to the typeAttributes map. This results in the link opening in the current tab, in non-Console applications, and loading a new Console tab in Sales or Service Console. The default behavior, if target is not specified, is to open a new browser tab, even in Console applications, which will often not be the desired behavior.

Using the Classic form of the record URL (/<id>) does work, but redirects through a translation URL. For future compatibility, it’s best to just use the current-generation URL format.

This process of synthesizing new fields for <lightning:datatable> can be repeated arbitrarily, both for URL fields and for other types of calculated data, like icon columns. It’s important to remember, of course, that these synthetic properties cannot be persisted to the server because they’re not real sObject fields. Input to server actions must be constructed appropriately.

JavaScript’s loosey-goosey type system and object model can be confusing for Apex programmers, but it offers a lot of freedom in return - and the ability to do things with sObjects we’d need a wrapper class to handle in Visualforce.