[dpdk-dev] scripts: check commit formatting

Message ID 1459286986-31148-1-git-send-email-thomas.monjalon@6wind.com (mailing list archive)
State Superseded, archived

Commit Message

Thomas Monjalon March 29, 2016, 9:29 p.m. UTC
The git messages have three parts:
1/ the headline
2/ the explanations
3/ the footer tags

The headline helps to quickly browse a history or instantly catch the
purpose of a commit. Keeping it short, with consistent wording,
makes it easy to parse or to match against patterns.

The explanations must give key information such as the reason for the
change. Nothing can be automatically checked for this part.

The footer contains some tags to find the origin of a bug or who
was working on it.

This script performs some basic checks on parts 1 and 3.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
---
 scripts/check-git-log.sh | 119 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 119 insertions(+)
 create mode 100755 scripts/check-git-log.sh
  

Comments

Yuanhan Liu March 30, 2016, 1:46 a.m. UTC | #1
On Tue, Mar 29, 2016 at 11:29:46PM +0200, Thomas Monjalon wrote:
> The git messages have three parts:
> 1/ the headline
> 2/ the explanations
> 3/ the footer tags
> 
> The headline helps to quickly browse a history or instantly catch the
> purpose of a commit. Keeping it short, with consistent wording,
> makes it easy to parse or to match against patterns.
> 
> The explanations must give key information such as the reason for the
> change. Nothing can be automatically checked for this part.

Actually, I think we might be able to do 2 tests here:

- space line between paragraphs.

- lines over 80 chars.

	--yliu
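
The line-length test suggested above could look roughly like the sketch
below. It is not part of the posted patch: the long_lines helper, its
default limit of 80 and the sample text are assumptions made for
illustration. In the script itself the input would presumably come from
`git log --format='%b' $range`.

```shell
#! /bin/sh

# Hypothetical helper (not in the posted patch): print, tab-indented,
# every line read from stdin that is longer than the given limit.
# The limit defaults to 80, the value suggested in this reply.
long_lines() {
	awk -v max="${1:-80}" 'length($0) > max { print "\t" $0 }'
}

# Sample explanations standing in for: git log --format='%b' $range
sample=$(printf '%s\n' \
	'short line' \
	'this explanation line is deliberately written to run well past the eighty character limit')

bad=$(printf '%s\n' "$sample" | long_lines 80)
# The message is passed as a printf argument, so a '%' in a commit
# message is not interpreted as a format directive.
[ -z "$bad" ] || printf 'Line too long in explanations:\n%s\n' "$bad"
```

Taking the limit as an argument keeps the helper usable whichever value
the discussion settles on.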
  
Bruce Richardson March 30, 2016, 2:27 p.m. UTC | #2
On Wed, Mar 30, 2016 at 09:46:34AM +0800, Yuanhan Liu wrote:
> Actually, I think we might be able to do 2 tests here:
> 
> - space line between paragraphs.
> 
> - lines over 80 chars.

75 chars for commit messages, and 50 for commit titles :-)

/Bruce
> 
> 	--yliu
  
Yuanhan Liu March 30, 2016, 2:44 p.m. UTC | #3
On Wed, Mar 30, 2016 at 03:27:40PM +0100, Bruce Richardson wrote:
> On Wed, Mar 30, 2016 at 09:46:34AM +0800, Yuanhan Liu wrote:
> > Actually, I think we might be able to do 2 tests here:
> > 
> > - space line between paragraphs.
> > 
> > - lines over 80 chars.
> 
> 75 chars for commit messages, and 50 for commit titles :-)

I'd agree that 75 char is better. My personal preference is actually
68 :) 

	--yliu
  
Bruce Richardson March 30, 2016, 2:46 p.m. UTC | #4
On Wed, Mar 30, 2016 at 10:44:14PM +0800, Yuanhan Liu wrote:
> On Wed, Mar 30, 2016 at 03:27:40PM +0100, Bruce Richardson wrote:
> > On Wed, Mar 30, 2016 at 09:46:34AM +0800, Yuanhan Liu wrote:
> > > Actually, I think we might be able to do 2 tests here:
> > > 
> > > - space line between paragraphs.
> > > 
> > > - lines over 80 chars.
> > 
> > 75 chars for commit messages, and 50 for commit titles :-)
> 
> I'd agree that 75 char is better. My personal preference is actually
> 68 :) 
> 
I just go with what vim uses on my machine, because vim is always right :-)
  
Thomas Monjalon April 11, 2016, 10 a.m. UTC | #5
2016-03-30 15:27, Bruce Richardson:
> On Wed, Mar 30, 2016 at 09:46:34AM +0800, Yuanhan Liu wrote:
> > Actually, I think we might be able to do 2 tests here:
> > 
> > - space line between paragraphs.
> > 
> > - lines over 80 chars.
> 
> 75 chars for commit messages, and 50 for commit titles :-)

The 75 chars limit is already checked by checkpatch.pl.
But yes we can have our own check in this script.

For the title, I think we can accept 60 chars and exceptionally more.

To see the history of title length:
	git log --format='%s' |
	awk '{lens[length($0)]++;} END {for (len in lens) print len, lens[len] }' |
	sort -g
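
Building on the histogram command above, a headline length check might
be sketched as follows. This is only an illustration, not code from the
patch: the long_headlines helper and the sample headlines are
hypothetical, and 60 is the limit proposed in this reply.

```shell
#! /bin/sh

# Hypothetical helper (not in the posted patch): print, tab-indented and
# prefixed with its length, every headline read from stdin that exceeds
# the given limit (default 60).
long_headlines() {
	awk -v max="${1:-60}" \
		'length($0) > max { print "\t" length($0) ": " $0 }'
}

# Sample headlines standing in for: git log --format='%s' $range
headlines=$(printf '%s\n' \
	'scripts: check commit formatting' \
	'scripts: check the formatting of every commit message in a range given as argument')

bad=$(printf '%s\n' "$headlines" | long_headlines 60)
[ -z "$bad" ] || printf 'Headline too long:\n%s\n' "$bad"
```

Printing the length next to each headline makes it easier to judge
whether a title is only slightly over the limit, which matters if longer
titles are to be accepted exceptionally.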
  

Patch

diff --git a/scripts/check-git-log.sh b/scripts/check-git-log.sh
new file mode 100755
index 0000000..483abcb
--- /dev/null
+++ b/scripts/check-git-log.sh
@@ -0,0 +1,119 @@ 
+#! /bin/sh
+
+# BSD LICENSE
+#
+# Copyright 2016 6WIND S.A.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+#   * Redistributions of source code must retain the above copyright
+#     notice, this list of conditions and the following disclaimer.
+#   * Redistributions in binary form must reproduce the above copyright
+#     notice, this list of conditions and the following disclaimer in
+#     the documentation and/or other materials provided with the
+#     distribution.
+#   * Neither the name of 6WIND S.A. nor the names of its
+#     contributors may be used to endorse or promote products derived
+#     from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+# Check commit logs (headlines and references)
+#
+# If any doubt about the formatting, please check in the most recent history:
+#	git log --format='%>|(15)%cr   %s' --reverse | grep -i <pattern>
+
+range=${1:-origin/master..}
+
+headlines=$(git log --format='%s' $range)
+tags=$(git log --format='%b' $range | grep -i -e 'by *:' -e 'fix.*:')
+fixes=$(git log --format='%h %s' $range | grep -i ': *fix' | cut -d' ' -f1)
+
+# check headline format (spacing, no punctuation, no code)
+bad=$(echo "$headlines" | grep \
+	-e '	' \
+	-e '^ ' \
+	-e ' $' \
+	-e '\.$' \
+	-e '[,;!?&|]' \
+	-e ':.*_' \
+	-e '^[^:]*$' \
+	-e ':[^ ]' \
+	-e ' :' \
+	| sed 's,^,\t,')
+[ -z "$bad" ] || printf "Wrong headline format:\n$bad\n"
+
+# check headline label for common typos
+bad=$(echo "$headlines" | grep \
+	-e '^example[:/]' \
+	-e '^apps/' \
+	-e '^testpmd' \
+	-e 'test-pmd' \
+	-e '^bond:' \
+	| sed 's,^,\t,')
+[ -z "$bad" ] || printf "Wrong headline label:\n$bad\n"
+
+# check headline lowercase for first words
+bad=$(echo "$headlines" | grep \
+	-e '^.*[A-Z].*:' \
+	-e ': *[A-Z]' \
+	| sed 's,^,\t,')
+[ -z "$bad" ] || printf "Wrong headline uppercase:\n$bad\n"
+
+# check headline uppercase (Rx/Tx, VF, L2, MAC, Linux, ARM...)
+bad=$(echo "$headlines" | grep \
+	-e 'rx\|tx\|RX\|TX' \
+	-e '\<[pv]f\>' \
+	-e '\<l[234]\>' \
+	-e ':.*\<dma\>' \
+	-e ':.*\<pci\>' \
+	-e ':.*\<mtu\>' \
+	-e ':.*\<mac\>' \
+	-e ':.*\<vlan\>' \
+	-e ':.*\<rss\>' \
+	-e ':.*\<freebsd\>' \
+	-e ':.*\<linux\>' \
+	-e ':.*\<tilegx\>' \
+	-e ':.*\<tile-gx\>' \
+	-e ':.*\<arm\>' \
+	-e ':.*\<armv7\>' \
+	-e ':.*\<armv8\>' \
+	| sed 's,^,\t,')
+[ -z "$bad" ] || printf "Wrong headline lowercase:\n$bad\n"
+
+# check tags spelling
+bad=$(echo "$tags" |
+	grep -v '^\(Reported\|Suggested\|Signed-off\|Acked\|Reviewed\|Tested\)-by: [^,]* <.*@.*>$' |
+	grep -v '^Fixes: [0-9a-f]\{12\} (".*")$' |
+	sed 's,^.,\t&,')
+[ -z "$bad" ] || printf "Wrong tag:\n$bad\n"
+
+# check missing Fixes: tag
+bad=$(for fix in $fixes ; do
+	git log --format='%b' -1 $fix | grep -q '^Fixes: ' ||
+		git log --format='\t%s' -1 $fix
+done)
+[ -z "$bad" ] || printf "Missing 'Fixes' tag:\n$bad\n"
+
+# check Fixes: reference
+IFS='
+'
+fixtags=$(echo "$tags" | grep '^Fixes: ')
+bad=$(for fixtag in $fixtags ; do
+	good=$(git log --abbrev=12 --format='Fixes: %h ("%s")' -1 \
+		$(echo "$fixtag" | sed 's,^Fixes: \([0-9a-f]*\).*,\1,') 2>&-)
+	printf "$fixtag" | grep -v "^$good$"
+done | sed 's,^,\t,')
+[ -z "$bad" ] || printf "Wrong 'Fixes' reference:\n$bad\n"
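
To see what the "Wrong headline format" check in the patch accepts and
rejects, the same grep patterns can be tried on a few made-up headlines
without a git repository. The wrapper function name and the sample
headlines below are assumptions for illustration. The tab pattern is
generated with printf so that it survives copy and paste.

```shell
#! /bin/sh

# Same patterns as the "Wrong headline format" check in the patch,
# wrapped in a hypothetical function that filters headlines from stdin.
bad_headline_format() {
	grep \
		-e "$(printf '\t')" \
		-e '^ ' \
		-e ' $' \
		-e '\.$' \
		-e '[,;!?&|]' \
		-e ':.*_' \
		-e '^[^:]*$' \
		-e ':[^ ]' \
		-e ' :'
}

# The first headline passes every pattern. The other three are reported:
# trailing period, missing "label:" prefix, and no space after the colon
# (plus an identifier with an underscore).
printf '%s\n' \
	'scripts: check commit formatting' \
	'scripts: check commit formatting.' \
	'check commit formatting' \
	'scripts:check rte_eal_init' |
	bad_headline_format
```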