Improving Code Quality with Static Analysis

by Joseph D. Purcell

at DrupalCon Baltimore

April 27, 2017

Who am I?

My mentors...

@zendoodles
@craychee
@lukewertz
@stevector
@yesct
@crell
@becw

Who are you?

Typoglycemia

Woh nedes a sepll cehcker?

... computers aren't so forgiving

They do exactly as they are told,
no more and no less.

Donald Knuth Knuth (1968), "The Art of Computer Programming"

goto Fail


if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
    goto fail;
    goto fail;
            

is executed like...


if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) {
    goto fail;
}
goto fail;
              

...the rest of the code is not executed!

The "goto fail" bug could have been caught by static analysis

Source: Martin Fowler

If you're not using static analysis
you're wasting time.

  • Trivial back and forth on code style
  • Bug resolution for obvious problems (dead code, typos, etc)
  • The "grind" on poor code (complex, highly coupled)

Overview

  1. What is static analysis?
  2. What can the analysis find?
  3. How do I do continuous inspection?

What is static analysis?

Anything you can learn from code without executing it.

Imagine goto fail in PHP


if (($err = SSLHashSHA1::update($hashCtx, $signedParams)) != 0)
  goto fail;
  goto fail;
            
Violations:
  • The method HashHandshake() utilizes a goto statement. (PHPMD)
  • Inline control structures are not allowed (PHPCS)
  • Line indented incorrectly; expected 4 spaces, found 6 (PHPCS)

How is this done?

Read the file off disk and look for a pattern.

Static Analysis Tools

  • PHP Code Sniffer (PHPCS) e.g. Drupal Coder
  • PHP Mess Detector (PHPMD)
  • PHP Copy and Paste Detector (PHPCPD)
  • PHP Lines of Code (PHPLOC)
  • Many more...

Let's try it out...

Check: Global variable access

protected function doLoadMultiple(array $names, $immutable = TRUE) {
  ...
  if (isset($GLOBALS['config'][$name])) {
    $this->cache[$cache_key]->setSettingsOverride($GLOBALS['config'][$name]);
  ...

Check: Unused variable or method parameter

  protected function formSingleElement(FieldItemListInterface $items, $delta...
    $entity = $items->getEntity();

    $element += array(
      '#field_parents' => $form['#parents'],
      // Only the first widget should be required.
      '#required' => $delta == 0 && $this->fieldDefinition->isRequired(),
      '#delta' => $delta,
      '#weight' => $delta,
    );

    $element = $this->formElement($items, $delta, $element, $form, $form_state);

    if ($element) { 
      ...
If it stinks, change it.

What can the analysis find?

Characteristics of Checks

Readability
How easy is it to read and understand?
Maintainability
How easy is it to bugfix and apply updates?
Extensibility
How easy is it to add a feature?
Security
How vulnerable is it to malicious users?
Testability
How easily can a test be written to verify behavior?
Correctness (i.e. bugs, technical debt)
How efficiently does it meet requirements?

Check: Global variable access

protected function doLoadMultiple(array $names, $immutable = TRUE) {
  ...
  if (isset($GLOBALS['config'][$name])) {
    $this->cache[$cache_key]->setSettingsOverride($GLOBALS['config'][$name]);
  ...
Smells:
  • Testability: You don't have control over global state, e.g. flaky test.
  • Security: No sanitization of inputs.
  • Extensibility: There is no abstraction possible.

Check: Unused variable or method parameter

  protected function formSingleElement(FieldItemListInterface $items, $delta...
    $entity = $items->getEntity();

    $element += array(
      '#field_parents' => $form['#parents'],
      // Only the first widget should be required.
      '#required' => $delta == 0 && $this->fieldDefinition->isRequired(),
      '#delta' => $delta,
      '#weight' => $delta,
    );

    $element = $this->formElement($items, $delta, $element, $form, $form_state);

    if ($element) { 
      ...
Smells:
  • Maintainability: It's unclear what $entity is supposed to do.
  • Correctness: Is the lack of use of $entity a bug?

Check: Dead code

  private function getProxyInstantiator()
  {
    if (!$this->proxyInstantiator) {
      $this->proxyInstantiator = new RealServiceInstantiator();
    }
    ...
Smells:
  • Maintainability: Why is this method still here?
  • Testability: You can't test code that can't be executed.
  • Correctness: Code that isn't executed is usually a bug.

Check: Number of public methods

class EntityType implements EntityTypeInterface { 
  public function get($property) {
  public function set($property, $value) {
  public function isStaticallyCacheable() {
  public function isRenderCacheable() {
  public function isPersistentlyCacheable() {
  public function getKeys() {
  public function getKey($key) {
  public function hasKey($key) {
  public function id() {
  public function getProvider() {
  public function getClass() {
  public function getOriginalClass() {
  public function setClass($class) {
  public function isSubclassOf($class) {
  public function getHandlerClasses() {
  public function getHandlerClass($handler_type, $nested = FALSE) {
  public function setHandlerClass($handler_type, $value) {
  public function hasHandlerClass($handler_type, $nested = FALSE) {
  public function getStorageClass() {
  public function setStorageClass($class) {
  ... 49 more methods!
Smells:
  • Extensibility: Likely difficult to vary behavior.
  • Correctness: Likely violates Single Responsibility Principle.

Check: Use of statics

class WorkspaceController extends ControllerBase {

  public function add() {
    $types = WorkspaceType::loadMultiple();
    if ($types && count($types) == 1) {
      $type = reset($types);
      return $this->addForm($type);
    }
    ...
Smells:
  • Testability: difficult to stub return of ::loadMultiple()
  • Extensibility: cannot replace WorkspaceType with subclass

Check: Missing doc comment

  public function update(WorkspacePointerInterface $source, WorkspacePointerInterface $target) { 
    return $this->doReplication($source, $target);
  } 
Smells:
  • Maintainability: update is generic, why is it just a wrapper to doReplication?
  • Extensibility: How do you know you've changed this class according to author's intent?

Check: NPath or Cyclomatic Complexity

  public function disableEntityTypes() { 
    foreach ($entity_types as $entity_type_id => $entity_type) { 
      try { 
        if ($storage->hasData()) { 
      catch (\Exception $e) { 
      if ($has_data[$entity_type_id]) { 
        if ($entity_type_id == 'file') { 
    foreach ($entity_manager->getDefinitions() as $entity_type) { 
      if ($entity_type->isSubclassOf(FieldableEntityInterface::CLASS)) { 
        foreach ($entity_manager->getFieldStorageDefinitions($entity_type_id)...
          if ($storage_definition->getProvider() == 'multiversion' && !$storage...
    foreach ($enabled_entity_types as $entity_type_id) { 
    foreach ($entity_types as $entity_type_id => $entity_type) { 
      if ($has_data[$entity_type_id]) { 
    ...
Smells:
  • Readability: We can't comprehend that much complexity.
  • Maintainability: Try to debug this!
  • Correctness: Likely violates Single Responsibility Principle.

Check: Efferent/Afferent Coupling

...
namespace Drupal\multiversion;

use Drupal\Core\Entity\ContentEntityInterface;
use Drupal\Core\Entity\ContentEntityTypeInterface;
use Drupal\Core\Entity\EntityTypeInterface;
use Drupal\Core\Entity\EntityManagerInterface;
use Drupal\Core\Entity\FieldableEntityInterface;
use Drupal\Core\State\StateInterface;
use Drupal\Core\Language\LanguageManagerInterface;
use Drupal\Core\Cache\CacheBackendInterface;
use Drupal\Core\Database\Connection;
use Drupal\multiversion\Workspace\WorkspaceManagerInterface;
use Symfony\Component\DependencyInjection\ContainerAwareInterface;
use Symfony\Component\DependencyInjection\ContainerAwareTrait;
use Symfony\Component\Serializer\Serializer;

class MultiversionManager implements MultiversionManagerInterface, ContainerAwareInterface {
...
Smells:
  • Extensibility: High coupling = hard to change.
  • Testability: You have to double all those classes!
  • Correctness: Likely violates Single Responsibility Principle.

Check: Space before parenthesis

...
    $promotion->apply($this->order);
    
    $this->assertEquals(1, count($this->order->getAdjustments()));
    $this->assertEquals(new Price('39.00', 'USD'), $this->order->getTotalPrice
    ());
...
Smells:
  • Readability: This looks different... but why?
  • Maintainability: Fixing creates merge conflict
    (Drupal's strict "only change what you have to")

How do I do
continuous inspection?

Code Climate

A static analysis platform, free for open source (on GitHub).

How to configure

Examples on GitHub:
github.com/josephdpurcell/code-climate-and-drupal

.codeclimate.yml

---
engines:
  phpmd:
    enabled: true
    config:
      file_extensions: "php,inc,module,install"
      rulesets: ".phpmd.xml"
  phpcodesniffer:
    enabled: true
    config:
      file_extensions: "php,inc,module,install"
      encoding: utf-8
      standard: "Drupal"
      ignore_warnings: true
ratings:
  paths:
  - "web/modules/custom/**.php"
  - "web/modules/custom/**.inc"
  - "web/modules/custom/**.module"
  - "web/modules/custom/**.install"
exclude_paths:
- conf
- web/core
- web/modules/contrib
- web/profiles
- web/sites
- web/themes
- web/*.php

.phpmd.xml

<?xml version="1.0"?>
<ruleset xmlns="http://pmd.sourceforge.net/ruleset/2.0.0" xmlns:xsi="http://www.w3...
  <description>
    A PMD Ruleset for Drupal coding standards.
  </description>

  <!-- Code Size -->
  <rule ref="rulesets/codesize.xml/CyclomaticComplexity"/>
  <rule ref="rulesets/codesize.xml/NPathComplexity"/>
  <rule ref="rulesets/codesize.xml/ExcessiveMethodLength"/>
  <rule ref="rulesets/codesize.xml/ExcessiveClassLength"/>
  <rule ref="rulesets/codesize.xml/ExcessiveParameterList"/>
  <rule ref="rulesets/codesize.xml/ExcessivePublicCount"/>
  <rule ref="rulesets/codesize.xml/TooManyFields"/>
  ...

Ways to use static analysis

  1. Development workflow (continuous inspection)
  2. Code audit

1. Development Workflow

Source: Continuous Integration by Paul Duvall

If you're not running
static analysis with your tests
then you're missing a critical part
of Continuous Integration (CI).

...and you're wasting time.

Features of a good static analysis tool for dev workflow:

  • Isolation of violations by commit or PR
  • Ability to address false positives, e.g. dismiss
  • Indicator of overall health, e.g. GPA
  • Weighted impact of violations on overall health
  • Open source or integration with open source tools like phpcs and phpmd
  • Ability to run static analysis locally, e.g. in PHPStorm

Code Climate is a good fit.

Daily workflow looks like...

  1. Run static analysis checks locally, e.g. in your editor
  2. Commit and push code
  3. Automated tests and static analysis run
  4. If build fails, contributor revises
  5. If build succeeds, code is reviewed and merged
Source: github.com/18F/micropurchase/pull/524
Source: github.com/18F/micropurchase/pull/466
Source: codeclimate.com/github/18F/micropurchase/pull/466

2. Code Audit

  1. Planned refactoring
  2. Sprint retrospectives
Drupal 8.3 code metrics generated by phpmetrics.org, see results.

What not to do...

Static analysis is only as good
as it's configured to be.

Corollary: Don't trust a GPA unless you know it was configured correctly.

Static analysis is not a
replacement for you're brain.

Don't compare GPAs of projects
that use different configurations.

Don't avoid static analyisis
because there's too much to tackle.

100% compliance with static analysis is a good goal sometimes

Don't only run static analysis locally--they should be shared just like tests.

Cost of Owning a Mess

As time increases, productivity declines

Savings of Static Analysis

  • You're not wasting time looking for code style
  • You're not worrying about obvious bugs or typos
  • You can clean as you go
  • You know the parts that are hard to change

What's next?

  • Agree on static analysis tool and config for D7 and D8, core, contrib, projects
    (phpcs can work but is missing phpmd and phpcpd)
  • Agree on process for versioning the checks and rolling out updates
  • Start using static analysis on Drupal Core and Contrib
    (see #1299710, Code Climate can solve this now?)
  • Ensure the community can use the same checks on their projects, e.g. Drupal engine on Code Climate
  • Ensure editors can integrate with the same checks, e.g. PHPStorm plugin for Code Climate
  • Add GPAs to compare contrib modules

Longer term ideas

  • Every PHP or Drupal CVE gets a check written for PHPCS
    (ease workload on security team? e.g. see Ruby's Brakeman)
  • Automated code quality spike tickets: CVEs or GPA threshold
  • Use static analysis reporting as a way to deprecate Drupal API calls or indicate improper usage
  • Use these checks in autocorrect: PHP Code Beautifier and Fixer (PHPCBF)

Thank you!

Come to the sprints on Friday!

Slides: github.com/josephdpurcell/code-climate-and-drupal

Want help setting up static analysis or Code Climate?

Find me at @josephdpurcell.

References and
Further Reading