What Can You `GROUP BY`?

The Salesforce documentation is notably terse in describing Considerations When Using GROUP BY. The guidance provided for determining which fields can be grouped is simply:

The Field object associated with DescribeSObjectResult has a groupable field that defines whether you can include the field in a GROUP BY clause.

This is a rather roundabout way to point to the method DescribeFieldResult.isGroupable(); for example, Account.Description.getDescribe().isGroupable() returns false.

Further, the document states that

You can’t use child relationship expressions that use the __r syntax in a query that uses a GROUP BY clause.

This gives us very little to go on, without examining the Describe information for every single field we might want to group on (doubly challenging in a Dynamic SOQL context). So which field types do, in fact, permit grouping? And does the final sentence prohibit the use of custom relationships in GROUP BY?

Failure to use a properly groupable field yields the error

field ‘FIELD_NAME__c’ can not be grouped in a query call

It turns out that groupability breaks down pretty cleanly along type lines, with a few interesting nuances. The underlying SOAP type appears to be the primary, but not the sole, determinant. Some fields within the same SOAP type differ in groupability based on other facets. Further, some formula fields can be used as groupings - but not the ones you might naively expect from other Salesforce Platform limitations!

Types are listed below by UI type, with the SOAP type in parentheses. This information was derived from inspection of numerous field describes via Workbench and the Tooling API.

Groupable Field Types

  • Checkbox (boolean)
  • Phone (string)
  • Picklist (string)
  • Email (string)
  • Text (string)
  • Text Area (string)
  • URL (string)
  • Number (int). Does not include custom fields, only standard Number fields with SOAP type int, like Account.NumberOfEmployees.
  • Lookup (id)
  • Id (id)
  • Date (date)
  • Direct cross-object references to groupable fields, up to 5 levels from the root object (SOQL limit), as in SELECT count(Id) FROM Contact GROUP BY Account.Parent.Parent.Parent.Parent.Name. Both custom and standard references are groupable.
  • Formulas of type Checkbox and Date, including cross-object formulas across standard and custom relationships.

Non-Groupable Field Types

  • Address Compound Fields
    • Components of Address compound fields are groupable if their types otherwise allow it.
  • Geolocations, both custom and standard, and whether or not defined as having decimal places, including the compound field and components (location/double)
  • Long Text (string)
  • Rich Text (string)
  • Auto Number (string)
  • Multi-Select Picklist (string)
  • Number (double), including custom Number fields with or without decimal and regardless of scale.
  • Percent (double), including custom Percent fields with or without decimal and regardless of scale.
  • Currency (double), including custom Currency fields with or without decimal and regardless of scale.
  • Roll-Up Summary Fields (double), including COUNT rollups.
  • Encrypted Text Fields (Classic Encryption; string)
  • Date/Time (dateTime)
  • Time (time)
  • Formulas of types other than Checkbox and Date, including the otherwise-groupable String type.

This post grew out of an interesting question on Salesforce Stack Exchange. I was intrigued by the lack of definition to this facet of SOQL and spent some time putting together a community wiki answer, which revealed that my original answer was mistaken: GROUP BY is stranger than I thought.

Dreamforce '18 Retrospective

Dreamforce ‘18 featured some truly outstanding sessions on the next generation of Salesforce technologies and development practices. Standouts included the excellent sessions showing how technologies like Platform Events, Change Data Capture, Unlocked Packages, and Force-DI lead to modular, loosely-coupled, and event driven Salesforce applications.

My session, Continuous Integration and Salesforce DX: Concepts and Connections, had over 350 registrants and yielded some excellent and important questions about moving to Salesforce DX practice. It was a great experience, and the talk is now available on YouTube.

Dreamforce Session Information

I successfully completed the Salesforce Data Architecture and Management Designer certification, which means I’ve reached the Certified Application Architect level. This was my primary goal for 2018 and I’m thrilled to complete one half of the Architect pyramid. On to System Architect, and Dreamforce 2019!

Application Architect

Continuous Integration and Salesforce DX at Dreamforce '18

I’m thrilled to be returning to Dreamforce next week as a speaker for the second time.

Dreamforce Session Information

If you’re interested in adding continuous integration with Salesforce DX to your development lifecycle, join me for a technical, conceptual dive into CI architecture and practice.

Bookmark the session here.

Templating with `String.format()` and `String.join()` for Better Dynamic SOQL

Dynamic SOQL with complex queries and filters can easily become an unreadable mess. Consider a query like this one:

String query = 'SELECT Id ' +
    'FROM ' + objectName +
    'WHERE ' + contactLookupName + '.Title = :title ' +
    'AND ' + filterField + ' = \'' + filterValue + '\'' +
    'AND ' + startDateField + ' <= ' + dateValue
    'AND ' + endDateField + ' >= ' + dateValue;

return query;

What it’s trying to do is query a dynamically-determined child object of Contact for records associated to Contacts with a specific title, and which are valid for a specific date.

This query also exhibits (at least) five common issues with Dynamic SOQL, in no particular order:

  1. Issues with spacing. A missing space after objectName results in a parse error at runtime, since Dynamic SOQL syntax cannot be checked at compile time. These errors are very easy to make in Dynamic SOQL and aren’t always apparent upon inspection.
  2. Hanging Apex variable binds. The :title bind will be evaluated against the local scope at the time this query string is passed to Database.query(), not at the time of its creation. This makes storing and passing query strings as parameters fraught with risk: either you must construct and use all query strings within a single scope, or you must not use Apex binding (for which see more below), or you must risk evaluating binds in a different scope and introduce a non-compiler-checkable dependency in your code.
  3. SOQL injection vulnerabilities. Presumably user-supplied values (filterField, filterValue) aren’t escaped with String.escapeSingleQuotes() to mitigate potential injection attacks.
  4. Incorrect Date format conversion. While it’s legal and compiles, implicit Date-to-String conversion by doing myString + myDate results in the wrong Date format for SOQL and will yield runtime errors. It’s critical to convert Dates to SOQL format with String.valueOf(myDate), or to use Apex binding instead (resurfacing issue 2).
  5. Lack of readability and maintainability. The query is hard to read. It’s hard to format the code well. It’s hard to parse out what is user input and what’s not, and to get a sense of what the query’s overall structure looks like.

I strongly prefer to use a query template string with Dynamic SOQL, coupled with String.format(). This structure helps maintain a clean separation between the static core and dynamic parameters of the query, and can help mitigate issues 1 and 5, while making it easier to see and address issues 2, 3, and 4. Here’s what the query above might look like in that form:

final String queryTemplate;

queryTemplate = 'SELECT Id FROM {0} WHERE {1}.Title = ''{2}'' AND {3} = ''{4}'' AND {5} <= {6} AND {7} >= {6}';

return Database.query(
    String.format(
        queryTemplate
        new List<String> {
            String.escapeSingleQuotes(objectName),
            String.escapeSingleQuotes(contactLookupField),
            String.escapeSingleQuotes(title),
            String.escapeSingleQuotes(filterField),
            String.escapeSingleQuotes(filterValue),
            String.escapeSingleQuotes(startDateField),
            String.valueOf(dateValue),
            String.escapeSingleQuotes(endDateField)
        }
    )
);

By making this switch, we increase the readability of the query, clearly showing which elements are dynamic and which are the static core. We eliminate spacing issues. The compile-time type checking on the List<String> ensures that we cannot (unless we have additional logic in the parameters to new List<String>{}) perform implicit Date conversion, or other implicit type conversions.

While we’re still vulnerable to SOQL injection, explicitly listing out our parameters helps make clear which user-controlled values might need to be escaped. (Note that we’re escaping every user-supplied parameter here, even those for which a rogue quote doesn’t pose an injection risk; this keeps the static analyzer happy!)

String.format() isn’t a silver bullet: you still have to remain aware of Dynamic SOQL issues.

While this structure encourages us to use string substitution in lieu of Apex binds, we can still use binds and still encounter dangling bind issues. We’d typically want to retain use of binds anywhere a collection is used with IN. It’s best to keep those binds within a single method, rather than returning a query string that contains bind values, to avoid those dangling references.

It’s easier to show that user input is correctly escaped in this format, but it’s still not enforced by the compiler. A good static analyzer is needed to locate such issues.

Lastly, String.format() comes with a few interesting wrinkles due to its inheritance of the format used by Java’s MessageFormat. The most obvious consequence of this is the need to repeat all single quotes in the template string, as above (''{0}''). See the Java documentation for more nuances involved in this approach.

Dynamic SELECT Clauses

Dynamic SOQL templating doesn’t do anything whatsoever for enforcement of field-level security, which we need to handle in our code. Building SELECT clauses dynamically is another common source of string errors, too. Luckily, we can build on String.format() templating to handle dynamic SELECT clauses cleanly by using collections and String.join().

It’s tempting to build a dynamic SELECT query by just performing string concatenation, but this approach is inelegant and error-prone.

String query = 'SELECT Id'; // base query
// Add fields the user picked
for (String fieldNameOne : userSelectedFields) {
    query = query + ', ' + fieldNameOne;
}
// Add other fields from a field set or Custom Metadata
for (String fieldNameTwo : someObject.fieldSetA) {
    if (!query.contains(fieldNameTwo)) {
        query = query + ', ' + fieldNameTwo;
    }
}
// Do more work to build the query...

is about as clean as this approach gets, and I’ve seen far worse. It’s easy to make simple mistakes that cause hard-to-debug syntax errors, field deduplication to avoid QueryExceptions is messy (note that the above example doesn’t actually work in all cases!), and it’s hard to handle FLS properly.

Much better is to build the SELECT clause by constructing a List<String> whose contents are the API names of the desired fields. This offers three benefits:

  1. Easy production of a final, valid SELECT clause by applying String.join(), and templating in with String.format().
  2. Easy deduplication of the list with List.contains().
  3. Easy enforcement of FLS.

One natural pattern for accumulating and querying a list of fields with FLS enforcement goes like this (note that this assumes none of the selected fields include relationship paths):

List<String> fieldList = new List<String> { 'id', 'name' };
List<String> userSuppliedFields = getUserSuppliedFields();
String sobjectName = 'Contact';

// Build a single deduplicated list of fields.
for (String s : userSuppliedFields) {
    if (!fieldList.contains(s)) {
        // List.contains() is a case-sensitive comparison.
        fieldList.add(s.toLowerCase());
    }
}

// Iterate over fieldList and check accessibility for the object we're using
Schema.DescribeSObjectResult dsr = Schema.getGlobalDescribe().get(sobjectName).getDescribe();

for (String f : fieldList) {
    if (!dsr.fields.getMap().get(f).getDescribe().isAccessible()) {
        // We don't have read permission on this field.
        // Perform some validation action - remove this field,
        // display a PageMessage or Toast, etc.

        // Note that we omit an important check - validating
        // that user input is a real field! `NullPointerException` 
        // can result from the above `if` if `getUserSuppliedFields() 
        // doesn't implement this.
    }
}

// then query:

String query = String.format(
    'SELECT {0} FROM {1}',
    new List<String> {
        String.join(fieldList, ', '),
        sobjectName
    }
);

So there we are: clean, secure, testable Dynamic SOQL with just a touch of the Standard Library.

Testing Chatter in Apex without `seeAllData=true`

One of the key limitations of the Connect API, which provides interaction with Chatter inside Apex, is that unit tests of Connect API code typically require the use of seeAllData=true:

Chatter in Apex methods don’t run in system mode, they run in the context of the current user (also called the context user). The methods have access to whatever the context user has access to. Chatter in Apex doesn’t support the runAs system method.

Most Chatter in Apex methods require access to real organization data, and fail unless used in test methods marked @IsTest(SeeAllData=true).

While some Connect API methods provide dedicated setTest...()-style methods to supply standardized data, others simply cannot be called at all in test context without the seeAllData=true annotation - something all good Apex programmers avoid like the plague!

In some very simple cases, it may be enough to guard a Connect API call with if (!Test.isRunningTest()) { }. If used as a one-liner, it’ll even still receive code coverage. But, of course, this abrogates our duty to validate code behavior in our unit tests. To do that, we need to apply dependency injection so that we can see what our code under test is doing without actually invoking the Connect API.

Here’s a simple and very explicit example. Note that I’m not using a mocking library or StubProvider here, but hand-building a mock for this situation. For simple needs in environments that don’t use a pervasive mocking framework, I like this explicit model because it’s direct and easy to follow.

The outer class PostsToChatter represents our class that calls the Connect API.

We establish an inner interface, ChatterPoster, that must be implemented by our mock or delegate class, an instance of another class we’ll pass in that does all the action work of calling the Connect API. Think of it as a very thin abstraction layer - at times, using this pattern can even help genericize code to, for example, support notifications over either Chatter or email, simply by supplying a different delegate.

public class PostsToChatter {
    public interface ChatterPoster {
        void postToChatter(Id parentRecord, String text);
    }
 
    @TestVisible private class RealChatterPoster implements ChatterPoster {
        public void postToChatter(Id parentRecord, String text) {
            if (!Test.isRunningTest()) callConnectAPIHere(parentRecord, text);
            // Note that if we keep this minimal (one line) we still obtain full code coverage even though the Connect API call does not actually go through.
        }
    }

    private ChatterPoster myPoster;

    public PostsToChatter() {
        // Add a default delegate (the real Chatter poster class)
        this(new RealChatterPoster());
    }

    // This constructor could be `@TestVisible private` if the delegate isn't part of our public API.
    public PostsToChatter(ChatterPoster delegate) {
        myPoster = delegate;
    }

    public void doSomethingRequiringAChatterPost() {
        // ... stuff happens ...
        myPoster.postToChatter(someId, someText);
    }
}

Then, in test context, we have the freedom to inject a completely different class that implements PostsToChatter.ChatterPoster, but doesn’t really call the Connect API. Meanwhile, because we’ve kept our actual-Chatter delegate so tiny and simple (and because we’ve included Test.isRunningTest() guards), we can still call it in test context to get the code coverage, even though we can’t validate its behavior like we ought to.

Here’s an example of what one approach to this test class could look like. (This omits the boilerplate test case to cover the RealChatterPoster inner class). In this simple implementation, our mock poster makes assertions when called about its own parameters. A more advanced implementation would store them for validation by the test class.

@isTest
public static class PostsToChatterTEST {
    private class MockChatterPoster implements PostsToChatter.ChatterPoster {
        public Id expectedId;
        public String expectedString;
        public Integer timesCalled = 0;

        public void postToChatter(Id parentRecord, String text) {
            System.assertEquals(expectedId, parentRecord, 'correct parent');
            System.assertEquals(expectedString, text, 'correct post body');
            timesCalled++;
        }
    }
    
    @isTest
    public static void testTheClass() {
        MockChatterPoster mp = new MockChatterPoster();
        PostsToChatter p = new PostsToChatter(mp);
        
        // ... invoke class functionality...
         
        System.assertEquals(1, mp.timesCalled, 'chatter post called');
        // Post contents are validated in the mock - or store the supplied parameters 
        // in the mock and validate them here, if preferred.
    }
}

Avoiding seeAllData=true is still possible! It’s true that many real-world Chatter applications will involve more complex uses of the large and sophisticated Connect API, of which the feed posting shown here is only a small selection. In those cases, it will take work - or might even be infeasible - to apply the dependency-injection approach. The refactoring effort is likely to pay off, however, in quicker, easier, more reliable testing, and easier extensibility for the future.

For many straightforward uses of Chatter, though, like posting of notifications, code similar to the above will be enough to cover the testing needs.