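This article presents a script for batch-counting the lines of code committed to Git repositories.
1. A single Git repository
For a single Git repository, the following command totals the committed lines of code per author.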
git log --format='%aN' | sort -u | while IFS= read -r name; do printf '%s\t' "$name"; git log --pretty=tformat: --numstat --author="$name" | awk 'BEGIN{add=0;subs=0;loc=0} {if($1~/^[0-9]+/){add += $1; subs += $2; loc += $1 + $2 }} END {printf "%s\t%s\t%s\n", add, subs, loc }'; done
cd into the target repository directory and run the command above. The output looks like this:
peipeihh 13594 275 13869
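The columns mean the following:
- Column 1: the commit author
- Column 2: lines of code added
- Column 3: lines of code deleted
- Column 4: total lines changed (added plus deleted)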
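To see how the awk stage of the command above arrives at these columns, the following minimal sketch runs the same aggregation on a few hand-written numstat lines. The file names and numbers are made up for illustration, and sample_numstat and sum_numstat are hypothetical helper names. Git reports binary files as "-" in both count columns, so the numeric check skips them.

```shell
# Hypothetical numstat lines in the layout: added<TAB>deleted<TAB>path.
# The "-" row mimics how git reports a binary file.
sample_numstat() {
  printf '%s\t%s\t%s\n' 10 2 src/main.c
  printf '\n'
  printf '%s\t%s\t%s\n' - - docs/logo.png
  printf '%s\t%s\t%s\n' 5 0 README.md
}

# Same aggregation as the one-liner: sum added lines, deleted lines,
# and their total, looking only at rows whose first field starts with a digit.
sum_numstat() {
  awk 'BEGIN {add=0; subs=0; loc=0}
       {if ($1 ~ /^[0-9]+/) {add += $1; subs += $2; loc += $1 + $2}}
       END {printf "%s\t%s\t%s\n", add, subs, loc}'
}

totals=$(sample_numstat | sum_numstat)
echo "$totals"
```

With this sample input the three printed fields are the added total, the deleted total, and their sum, in the same order as columns 2 to 4 above.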
2. Multiple Git repositories
To analyze a batch of Git repositories, use the following batch analysis script.
#!/bin/bash
# 1. parse the start and end dates (both arguments are optional)
sinceDate=$1
untilDate=$2
if [ -z "$sinceDate" ]; then
    sinceDate="1970-01-01"
fi
if [ -z "$untilDate" ]; then
    untilDate=$(date '+%Y-%m-%d')
fi
echo "the period of analysis: sinceDate = $sinceDate, untilDate = $untilDate"
# 2. prepare the working folders and report files
repoTmp=$PWD
repoList="$PWD/repoList.txt"
repoLocalDir="$PWD/repoLocal"
repoReportFile="$PWD/tmpReport.txt"
repoFinalReportFile="$PWD/finalReport.txt"
if [ -d "$repoLocalDir" ]; then
    rm -rf "$repoLocalDir"
fi
mkdir -p "$repoLocalDir"
if [ -f "$repoReportFile" ]; then
    rm "$repoReportFile"
fi
touch "$repoReportFile"
if [ ! -f "$repoList" ]; then
    echo "Cannot find repoList.txt. Please create it and list all the git repositories to be analyzed, one per line, in the following format:"
    # single quotes keep the backticks literal; inside double quotes they would start a command substitution
    echo '```'
    echo "git@gitee.com:pphh/simple-demo.git master"
    echo '```'
    exit 1
fi
# 3. clone each git repo locally and collect the statistics
i=1
cat "$repoList" | while read -r line
do
    cd "$repoLocalDir"
    # split the line into "URL [branch]"
    repoDef=( $line )
    repoName=${repoDef[0]}
    repoBranch=${repoDef[1]}
    if [ -z "$repoName" ]; then
        continue
    fi
    if [ -z "$repoBranch" ]; then
        repoBranch="master"
    fi
    repoFolder="./$i-repo-$repoBranch"
    let i++
    echo
    echo "start to clone the repository: $repoName, branch = $repoBranch, folder = $repoFolder"
    # skip this repository if the clone fails, so git log never runs in the wrong directory
    if ! git clone "$repoName" -b "$repoBranch" "$repoFolder"; then
        echo "clone failed, skipping: $repoName"
        continue
    fi
    echo "clone is completed!"
    cd "$repoFolder"
    echo "try to investigate the submission status of the repository: $repoName, branch = $repoBranch"
    # --pretty=tformat: drops commit headers and messages so only numstat rows reach awk;
    # the author name is passed with -v so that names containing spaces do not break the awk program
    git log --format='%aN' | sort -u | while IFS= read -r name; do git log --pretty=tformat: --numstat --author="$name" --since="$sinceDate" --until="$untilDate" | awk -v name="$name" 'BEGIN {add=0;subs=0;all=0} {if($1~/^[0-9]+/){add += $1; subs += $2; all += $1 + $2 }} END {printf "%s\t%s\t%s\t%s\n", name, add, subs, all }' >> "$repoReportFile"; done
done
# 4. merge the per-repository rows into one row per author (tab-separated)
awk -F'\t' 'BEGIN {OFS="\t"} { newLines[$1]+=$2;deleteLines[$1]+=$3;all[$1]+=$4 } END {for (i in all) print i,newLines[i],deleteLines[i],all[i];}' "$repoReportFile" > "$repoFinalReportFile"
rm "$repoReportFile"
echo
printf 'name\tnew-code-lines\tdelete-code-lines\tall\n'
cat "$repoFinalReportFile"
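The merge in step 4 relies on awk associative arrays keyed by author name. The following minimal sketch shows the idea on made-up rows. It assumes tab-separated input in the layout name, added, deleted, total; the author names, numbers, and the helper names sample_rows and merge_rows are hypothetical.

```shell
# Hypothetical per-repository rows: name<TAB>added<TAB>deleted<TAB>total.
# "alice" appears twice, as if she had committed to two repositories.
sample_rows() {
  printf '%s\t%s\t%s\t%s\n' alice 10 2 12
  printf '%s\t%s\t%s\t%s\n' bob 3 1 4
  printf '%s\t%s\t%s\t%s\n' alice 5 0 5
}

# Sum each numeric column per author. The result is piped through sort
# because the iteration order of "for (n in array)" in awk is unspecified.
merge_rows() {
  awk -F'\t' 'BEGIN {OFS="\t"}
       {added[$1] += $2; deleted[$1] += $3; total[$1] += $4}
       END {for (n in total) print n, added[n], deleted[n], total[n]}' | sort
}

merged=$(sample_rows | merge_rows)
echo "$merged"
```

Splitting on tabs rather than on any whitespace keeps author names that contain spaces in a single field.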
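Demo steps:
- Download the script above, place it in a directory, and name it analyzeGitRepo.sh.
- In the same directory, create a repoList.txt file that lists every repository to analyze, one per line, in the following format: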
git@gitee.com:pphh/simple-demo.git master
git@gitee.com:pphh/blog.git master
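- Run the script with the command below. The two arguments are the start and end dates, and both are optional. The script relies on bash features (arrays and let), so on systems where sh is not bash, invoke it with bash instead.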
sh ./analyzeGitRepo.sh "2020-01-01" "2021-07-20"
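Each line of repoList.txt holds a repository URL followed by an optional branch name. As a minimal sketch, such a line can also be split with plain read instead of a bash array, which would work in any POSIX shell. It assumes the same master fallback as the script; parse_repo_line is a hypothetical helper name and is not part of the original script.

```shell
# parse_repo_line is a hypothetical helper: it splits "URL [branch]" on
# whitespace and falls back to "master" when no branch is given.
# Results are left in the variables repoName and repoBranch.
parse_repo_line() {
  read -r repoName repoBranch _rest <<EOF
$1
EOF
  repoBranch=${repoBranch:-master}
}

parse_repo_line "git@gitee.com:pphh/simple-demo.git"
echo "$repoName -> $repoBranch"

parse_repo_line "git@gitee.com:pphh/blog.git master"
echo "$repoName -> $repoBranch"
```

Anything after the branch name lands in _rest and is ignored, so a trailing comment or stray token on a line would not end up in the branch name.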
Sample output:
% sh ./analyzeGitRepo.sh "2020-01-01" "2021-07-20"
the period of analysis: sinceDate = 2020-01-01, untilDate = 2021-07-20
start to clone the repository: git@gitee.com:pphh/simple-demo.git, branch = master, folder = ./1-repo-master
Cloning into './1-repo-master'...
remote: Enumerating objects: 1182, done.
remote: Counting objects: 100% (279/279), done.
remote: Compressing objects: 100% (195/195), done.
remote: Total 1182 (delta 66), reused 0 (delta 0), pack-reused 903
Receiving objects: 100% (1182/1182), 306.24 KiB | 1.76 MiB/s, done.
Resolving deltas: 100% (267/267), done.
clone is completed!
try to investigate the submission status of the repository: git@gitee.com:pphh/simple-demo.git, branch = master
start to clone the repository: git@gitee.com:pphh/blog.git, branch = master, folder = ./2-repo-master
Cloning into './2-repo-master'...
remote: Enumerating objects: 357, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 357 (delta 2), reused 0 (delta 0), pack-reused 351
Receiving objects: 100% (357/357), 2.16 MiB | 1.82 MiB/s, done.
Resolving deltas: 100% (88/88), done.
clone is completed!
try to investigate the submission status of the repository: git@gitee.com:pphh/blog.git, branch = master
name new-code-lines delete-code-lines all
peipeihh 1586 0 1586
huangyh 0 0 0
The report is also written to finalReport.txt in the same directory.
3. Demo script
See the following repository:
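Because the report is a plain text file, it can be post-processed with standard tools. The sketch below ranks authors by total lines, largest first. It assumes a tab-separated report in the layout name, added, deleted, total (for a space-separated file, drop the -t option); the sample rows and the helper names sample_report and rank_by_total are hypothetical, and real input would come from finalReport.txt.

```shell
# Hypothetical report rows: name<TAB>added<TAB>deleted<TAB>total.
sample_report() {
  printf '%s\t%s\t%s\t%s\n' bob 3 1 4
  printf '%s\t%s\t%s\t%s\n' alice 15 2 17
  printf '%s\t%s\t%s\t%s\n' carol 0 0 0
}

# Sort numerically on the fourth field (total lines), in descending order.
rank_by_total() {
  tab=$(printf '\t')
  sort -t "$tab" -k4,4nr
}

ranked=$(sample_report | rank_by_total)
echo "$ranked"
```

On a real run the same function could be fed from the file, for example with rank_by_total < finalReport.txt.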
- https://gitee.com/pphh/simple-demo/tree/master/demo-gitrepo-codeline-analysis