oinume journal

Scratchpad of what I learned

Detecting duplicated code in Golang with CPD and Jenkins

CPD (Copy Paste Detector)

Detecting duplicated code is a good way to keep source code clean. Since I couldn't find a way to detect duplicated code in Golang, I sent a pull request that adds Golang support to CPD (Copy Paste Detector). CPD is a tool included in PMD.

There is also a useful Jenkins plugin, DRY, which visualizes how much duplicated code exists. I'll show you how to set it up with Jenkins.

Setting up Jenkins

Use docker-toolbox to skip the work of installing Jenkins. After installing docker-toolbox, just type the following command.

$ docker run -p 8080:8080 jenkins

You can then open http://192.168.99.100:8080/ in a web browser, where you'll see the initial Jenkins page.
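The address 192.168.99.100 is only the usual docker-toolbox default and may differ on your machine. Here is a minimal sketch for building the URL from the VM's actual IP. The helper name `jenkins_url` is my own, and the machine name `default` is an assumption (it is what docker-toolbox normally creates).

```shell
# Build the Jenkins URL from a Docker host IP and an optional port (default 8080).
# jenkins_url is a hypothetical helper, not part of Docker or Jenkins.
jenkins_url() {
  ip="$1"
  port="${2:-8080}"
  printf 'http://%s:%s/\n' "$ip" "$port"
}

# With docker-toolbox, the VM's IP can be looked up with docker-machine.
# "default" is the usual machine name (assumption); adjust it if yours differs.
if command -v docker-machine >/dev/null 2>&1; then
  jenkins_url "$(docker-machine ip default)"
fi
```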

Installing the DRY Plugin

  • Click "Manage Jenkins" on the top page
  • Click "Manage Plugins"
  • Click the "Available" tab
  • Select "Duplicate Code Scanner Plug-in"
  • Click "Download now and install after restart"
  • Jenkins restarts after the plugin is installed

Create a job to detect duplicated code

We'll use the terraform repository as an example.

  • Click "create new jobs" on the top page
  • Select "Freestyle project" and type a name for your job.
  • Click "Add build step", select "Execute shell", and paste the following shell commands
export PMD=pmd-bin-5.3.3
# Download and unpack PMD (which includes CPD) unless it is already there
if [ ! -d "$PMD" ]; then
  curl -L -o "$PMD.zip" http://downloads.sourceforge.net/project/pmd/pmd/5.3.3/pmd-bin-5.3.3.zip\?r\=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fpmd%2Ffiles%2Fpmd%2F5.3.3%2F\&ts\=1440850447\&use_mirror\=jaist
  [ -e "$PMD.zip" ] && unzip "$PMD.zip"
fi

# Clone the repository to scan on the first run only
[ ! -d terraform ] && git clone https://github.com/hashicorp/terraform.git

# "|| echo" keeps the build step from failing when CPD exits with a non-zero status
"$PMD/bin/run.sh" cpd --minimum-tokens 100 --files terraform --language go --format xml > cpd.xml || echo
  • Click "Add post-build action" under "Post-build Actions" and select "Publish duplicate code analysis results"
    • Duplicate code results: cpd.xml
    • High priority threshold: 50 (default)
    • Normal priority threshold: 25 (default)
  • Save the job and run it.
  • A "Duplicate Code" link now appears on the job page

Figure: Duplicate Code view (Jenkins DRY Plugin)

Figure: Duplicated code of terraform
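If you want a quick number without opening Jenkins, the report can also be inspected from the shell. The following is a rough sketch, assuming cpd.xml contains one `<duplication ...>` element per detected duplicate. The function name `count_duplications` is hypothetical, and it relies on `grep -o`, which GNU and BSD grep support.

```shell
# Count <duplication ...> elements in a CPD XML report.
# count_duplications is a hypothetical helper, not part of PMD or CPD.
count_duplications() {
  # grep -o prints each match on its own line, wc -l counts them,
  # and tr strips the padding some wc implementations add.
  grep -o '<duplication ' "$1" | wc -l | tr -d ' '
}

# Print the count if the report from the job above exists.
report="cpd.xml"
if [ -f "$report" ]; then
  echo "duplications: $(count_duplications "$report")"
fi
```

This only counts entries. It does not replace the per-file view that the DRY plugin gives you.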

Options of CPD

  • --minimum-tokens: the minimum number of tokens a duplicated block must contain to be reported. I recommend 70 - 100.
  • --files: the directory to check for duplication.
  • --language: the programming language of the source files (go in this example).
  • --format: the report format, xml or text
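To get a feel for the --minimum-tokens range, you can run CPD once per threshold and compare the reports. Below is a minimal sketch using only the options listed above. The `run_cpd` function, the `DRY_RUN` switch, and the `cpd-<tokens>.xml` file names are my own inventions for illustration. By default it only prints the commands it would run.

```shell
PMD="${PMD:-pmd-bin-5.3.3}"

# Print (DRY_RUN=1, the default here) or run CPD for one --minimum-tokens value.
# run_cpd and DRY_RUN are hypothetical names, not part of PMD.
run_cpd() {
  tokens="$1"
  dir="$2"
  # Rebuild the positional parameters as the full CPD command line.
  set -- "$PMD/bin/run.sh" cpd --minimum-tokens "$tokens" --files "$dir" --language go --format xml
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    # A non-zero exit from CPD is ignored so the loop keeps going.
    "$@" > "cpd-$tokens.xml" || true
  fi
}

# Try both ends of the recommended range.
for tokens in 70 100; do
  run_cpd "$tokens" terraform
done
```

Set DRY_RUN=0 to actually run CPD. A lower threshold reports shorter duplicated blocks, so expect the report for 70 to be the larger one.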

For further information, run the following command.

pmd-bin-5.3.3/bin/run.sh cpd -h

Finally

We can detect duplicated code with CPD and Jenkins.