Hacker News
Git-cliff – Generate changelog files from the Git history (github.com/orhun)
223 points by ducktective on Sept 5, 2021 | 51 comments



While auto-generated changelogs aren't the best, they are better than nothing. Too often I've seen projects without a changelog which is especially annoying when dealing with breaking changes.

I've been considering switching to a changelog generator, either from Conventional Commits or from a folder of files just to avoid merge conflicts with the CHANGELOG file.

If people want enforcement of Conventional Commit, check out https://github.com/crate-ci/committed


I work with projects where this is utilized and I'm not sure they are better than nothing. Maybe I will change my mind someday, but up to this point, "conventional commits" are the stupidest thing ever. First off, this is not really a convention, more like a pretentious attempt to invent one: the prefixes are really badly defined and it isn't really clear when one or the other should be used. As a result: if you are working alone (or your only teammate has a mindset very similar to yours), you won't need it; it will just be inconvenient. If you have a team of developers with various experiences and habits that needs a convention to be enforced, it will be very hard to evaluate if they use prefixes correctly, let alone if the commit messages are any good.

The real problem is that commit messages and a changelog serve 2 different purposes and have different audiences. Changelog exists to explain what happened with the product, and commit messages exist to explain what happened with the code. These are the same thing only in the most basic situations, like "change Delete button color to red" (and then you probably don't even want to clutter your changelog with such bullshit at all). So, this works when 1 Jira ticket equals 1 commit. This is not usually the case, and, what's more, this usually shouldn't be the case.

If your changelog audience is a project manager, for example, you are better off generating it from ticket IDs that you include in the commits that resolve some task (you don't need and probably don't want that atrocity of "conventional commits" for that). A changelog generated from "conventional commits" is verbose, clunky, sometimes outright false, and misses most of the important stuff if some things are resolved in libraries. It is simply bad. But arguably better than nothing if there are no tradeoffs. But there are.

The real problem is, now your commits are bad as well, because they are completely fucked up in an attempt to shape them into a changelog, which cannot be done effectively. Best case scenario, your commits now completely mirror your tickets and you end up with huge commits, but at least the changelog looks OK-ish. So the tool intended for developers no longer serves them. Anything in between and both your commits and your changelog are messed up.


> it will be very hard to evaluate if they use prefixes correctly,

I faced this on a small team that adopted it. From my reading, it's OK if you create some of your own prefixes that suit your workflow. The point is to get everyone to use the same thing, not to try to say "feat/chore/etc" are the only things you can ever use. But... the team I was on didn't want to change anything, just use out of the box defaults, because "these are the community standards" (words to that effect).

I do not like being forced into the "conventional commit" style, as it gives the impression that it's providing something of value when, in our case, it's not really valuable to anyone on the team. I did some other work in 2019 on a different team which was much more efficient and effective with their commits, review, merging, etc., but didn't use conventional commits.


Using pull-request descriptions for the auto-generated changelog gives authors more flexibility to describe their work. It's possible to embed media in the PR description. It's easier to encourage ample description in PR descriptions than in commit messages. We often want to leave commits at a finer granularity than features, which makes it hard to tell a coherent story through commits alone.


> I work with projects where this is utilized and I'm not sure they are better than nothing.

At least you know which commits caused a breaking change compared to spelunking the history to figure out how a library broke compatibility and how you need to adapt to it. Breaking changes are honestly one of the main reasons I read changelogs.

> The real problem is that commit messages and a changelog serve 2 different purposes and have different audiences. Changelog exists to explain what happened with the product, and commit messages exist to explain what happened with the code.

imo a commit summary and the start of the body should explain the goal and the user-facing aspect, and then should dig down into the whats and whys of how the changed lines support that goal, using the Inverted Pyramid / Bottom-line Up Front writing style.

Of course, if the commit has no end-user impact, you signify that with the type (feat/fix/etc vs chore/style/etc)

While doing a hand-curated changelog for end-users would be more polished, the summary of a commit message should still be passable.


IMO it's a good idea to add a GitHub Issues label for each conventional commit type and scope:

  <type>[optional scope]: <description>


  fix:
  feat:
  build:
  chore:
  ci:
  docs:
  style:
  refactor:
  perf:
  test:
  etc.

Let's say we have the following scopes: core, infra, api, runtime. Then add a label in GitHub Issues for each of them.
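
A minimal sketch of how the header format above could be mapped to label names. The "type:" and "scope:" label naming is an assumption made up for illustration, not something the comment or GitHub prescribes; the optional "!" before the colon is the Conventional Commits marker for a breaking change.

```python
import re

# Header shape from the format quoted above:
#   <type>[optional scope]: <description>
# An optional "!" before the colon marks a breaking change.
HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^()]+)\))?(?P<breaking>!)?: (?P<description>.+)$"
)


def labels_for(header):
    """Return label names for a commit header, or [] if it does not parse.

    The "type:<x>" / "scope:<y>" naming scheme is hypothetical.
    """
    match = HEADER_RE.match(header)
    if match is None:
        return []
    labels = ["type:" + match.group("type")]
    if match.group("scope"):
        labels.append("scope:" + match.group("scope"))
    if match.group("breaking"):
        labels.append("breaking")
    return labels


print(labels_for("fix(api): handle empty payloads"))
print(labels_for("feat!: drop legacy config format"))
print(labels_for("updated stuff"))
```

Headers that do not follow the format yield no labels, which is one cheap way to spot commits that slipped past the convention.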


Not a bad idea, but why would or should adding a label to a Git commit message have anything to do with GitHub?


What I do is include the ChangeLog in the tag for the release.

I do semi-manually generate the ChangeLog using the titles of the Story/Bug that was merged in for a given PR, and then any commits that were done directly get a special notation (if any exist). The list of Stories/Bugs is generated automatically (branch->PR->Story).

This way there is no ever growing file, but the ChangeLog is available for every release within the VCS, and it's organized by release.


Nice project :)

I think this relies on you following conventions for commit messages. There could be an interesting use case for something like GPT-3 here: I'd love to not have to think too hard about carefully writing parsable commit messages but still be able to have something scan my commit log, understand context and create a summarized changelog from it.


We’ve been using charmixer/auto-changelog-action to generate release notes. This action makes nice references to GitHub pull requests. The release notes are attached to a GitHub release by an action. This turns out to be an invaluable reference for SQA and Product Manager for testing and creating customer-facing release notes.

One downside is that we run into trouble with GitHub API rate-limiting.

https://github.com/charmixer/auto-changelog-action


I don't see it documented in that action, but from the Dockerfile this appears to be wrapping github-changelog-generator [1].

[1]: https://github.com/github-changelog-generator/github-changel...


You're right; it's just packaging github-changelog-generator. Thanks for pointing that out.


Nice work right there. One thing that makes me appreciate it even more is the fact that it's written in Rust without the authors trying to make that a primary selling point, which is annoyingly common these days ("X written in Rust").


What’s being ‘sold’ is the speed of Rust not the novelty of the language choice


Saying "written in Rust" as a proxy for performance is a bit like saying "X in Y kb" in JS-land for the same reason. It says nothing about actual performance. One needs to show benchmarks if they want to have a convincing performance argument.


Why would speed matter for something that runs a few regexps over some commit messages once a month? I would have chosen Perl for this task: good enough speed for the task, excellent built-in support for regexps, templating libraries like the Template Toolkit, and flexibility around things like loading custom modules at runtime and parsing configuration.


Got to admire how smoothly we went from "Props for not selling language x for this" to "I am selling language y for this" in just 3 consecutive posts.


Language choice is a legitimate decision, and debating which one to use is a legitimate topic. Selling an already-existing product based on its language is questionable, as the parent pointed out.


It could matter in edge cases, perhaps someone runs this on google3 or the linux kernel and it takes 10 seconds versus a minute+ on something like Perl (not sure if that's the case).


I agree. If that's really one of the requirements then Rust or C++ or OCaml or something like that would make a lot of sense.

Also I'd like to add to my original comment above: I don't care at all that this is implemented in Rust. Good on you. Open source software goes where the developers go, and isn't dictated by anything except what the developers want to do.


We don't need more of the "git-[whatever]" commands to be in scripting languages. It's hard enough to bundle, distribute, and support as it is (especially on Windows). Though, yes, parts of it are already Perl so that probably would have been fine, but still a move in the wrong direction.


I would argue that people aggressively (perhaps annoyingly) evangelizing Rust is part of how it has come to be "normalized" now. People were doing the same thing with Go several years ago.

Also I am not bothered by it; I personally find it interesting when a tool is implemented in a language that I find interesting. If it's a language I don't care about, I shrug and move on.

Are you bothered when somebody says "implemented in C 99"?

If you're at the point of looking at the source code, the language used for implementation might actually be relevant to you when making a choice about whether to use a piece of software or library.


It bothers me because it may lead to overexploitation, turning a great language (Rust, my absolute favorite at this very moment) into an abomination from the depths of hell. Similar to what we've seen with many languages over the years.

Library - yes, of course it matters what it's written in when I'm choosing it. Software - no, not really, as long as it does what it's supposed to do, C, C++, go, erlang, java, kotlin, python, julia, rust or brainfuck for all I care - sure, god speed.


> Are you bothered when somebody says "implemented in C 99"?

Ironically, I felt this was your strongest argument because I am bothered when someone says that... like, I am first bothered that someone thinks the language their tool is written in is a selling point, but then I am additionally bothered (even somewhat enraged) that anyone considers their project being written in C99 (and not at least C++, and even explicitly so) to be a selling point!

Like, whenever I see "implemented in C99" I tend to be able to very quickly find a few buffer overflows or memory leaks, as it is so hard to dot all your i's and cross all your t's manually in every single function (with the consequent annoyance that I say the code is likely buggy, get challenged to find a bug, find multiple, sometimes only a few minutes later, and then the goal posts shift to "well we fixed the bugs you found, so we're fine")... and so I guess "implemented in Rust" is at least telling me the code that implemented the tool I am about to use is more likely to be correct? ;P


Fair enough.

> so I guess "implemented in Rust" us at least telling me the code that implemented the tool I am about to use is more likely to be correct?

This was my point! It's information for potential consumers, that might or might not be interesting or relevant to you.

Also some people just like talking about how they did something, not just what they did. Maybe they're proud to show it off or they want to let people know about whatever cool thing they like.


Cool, does it also work when you squash PRs? One thing that annoys me about Conventional Commits is that it assumes merge or rebase, but if you squash features / bugs it's a one-line change in your history. By default the auto-generated message of a squash does not conform to Conventional Commits.


It’s good practice to review that you have an appropriate commit message in squashed commits before pushing.

Not sure what you mean by “auto generated message”: “squash” during an interactive rebase will by default ask you to edit the commit message before generating the commit, prefilled with the concatenation of the messages of all the squashed commits.


How many people actually customize that message?


At the very least, anyone contributing to a repo with contribution guidelines around commit message format is expected to.

Anyone who cares about others reading the commit history, should.


I'm building a changelog system at the moment. It's basically the github_changelog_generator gem, but without overwriting old changelog entries. In git-cliff terms, it has `--prepend CHANGELOG.md`. But the data it gives you is a list of closed issues and merged PRs instead of commits.

I like the prepend mode a lot, because I would like to write proper release notes above the autogenerated changes. Machines are only so good at this stuff. I think everyone should use these things in prepend mode.

I am not personally a big fan of conventional-commits, mostly because you can never edit the message if you mess it up, and I am more of a ten-commits-at-a-time person so it's hard to remember. I guess you can put it on a (PR) merge commit. But I also think there's a lot of value in scooping up data from the GitHub API, like github_changelog_generator does. Listing closed issues by whether they were closed in the time between tags is fantastic, it's like if you'd been using milestones all along. This would be a great addition to git-cliff: `--issues` to parse commit messages for GitHub-style issue-closing directives ("fixes #24") and build a list of closed issues to link or render freestanding. Same goes for "Merge pull request #25 from ...".
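
A rough sketch of the parsing such an option would need, assuming only the same-repository "#N" reference form. The `--issues` flag itself is this comment's proposal, not an existing option as far as the comment says, and the sample messages and branch name below are made up.

```python
import re

# GitHub's closing keywords: close/closes/closed, fix/fixes/fixed,
# resolve/resolves/resolved, followed by an issue reference such as "#24".
CLOSES_RE = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)", re.IGNORECASE
)
# Subject line of a merge commit created when a pull request is merged.
MERGE_RE = re.compile(r"^Merge pull request #(\d+) from ", re.MULTILINE)


def closed_issues(messages):
    """Return the sorted issue numbers closed by any of the commit messages."""
    found = set()
    for message in messages:
        found.update(int(number) for number in CLOSES_RE.findall(message))
    return sorted(found)


def merged_pull_requests(messages):
    """Return the sorted pull request numbers found in merge-commit subjects."""
    found = set()
    for message in messages:
        found.update(int(number) for number in MERGE_RE.findall(message))
    return sorted(found)


# Made-up commit messages for illustration.
log = [
    "Fix crash on empty repo\n\nFixes #24",
    "Merge pull request #25 from someone/feature-branch\n\nAdd tag filter",
    "docs: tidy README (closes #30, resolves #31)",
    "refactor prefixes #99 handling",
]
print(closed_issues(log))
print(merged_pull_requests(log))
```

The word boundary in front of the keyword keeps words like "prefixes" from being read as "fixes". Titles and links for the collected numbers would still have to come from the GitHub API.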


gnulib and a lot of GNU projects have had this for a long time. https://github.com/coreutils/gnulib/blob/master/build-aux/gi...


We’ve been doing something similar at my day job for a couple of years now, at least. We tried a few different things, but this has caused us the fewest problems.

We have a monorepo with a dozen different products, supporting four rolling release series at any time. Some code is shared between products. So having a commit that contains the release note is very convenient. It’ll automatically follow merges, both when merging up bug fixes through all the release branches and merging in features.

When it’s time to build release notes, simply walk the new commits since last release and extract each release note.

Note, I’m leaving out most details on exactly how we have this set up. It’s not that complicated though.
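
Since the details are left out, here is only a guess at the general shape. It assumes a hypothetical "Release-note:" line inside each commit message; the real marker and tooling used by this team are not stated.

```python
# Hypothetical convention: a commit carries its release note on a line that
# starts with "Release-note:". The marker name is an assumption.
MARKER = "Release-note:"


def extract_release_notes(commit_messages):
    """Return the release-note text of every commit that has one, in order."""
    notes = []
    for message in commit_messages:
        for line in message.splitlines():
            if line.startswith(MARKER):
                note = line[len(MARKER):].strip()
                if note:
                    notes.append(note)
    return notes


# Made-up messages standing in for "the new commits since last release".
commits_since_last_release = [
    "Fix off-by-one in pager\n\nRelease-note: Paging no longer skips the last row.",
    "Refactor pager internals",
    "Add CSV export\n\nRelease-note: Reports can now be exported as CSV.",
]
for note in extract_release_notes(commits_since_last_release):
    print("- " + note)
```

In practice the messages could come from something like `git log <last-release-tag>..HEAD --format=%B`, with the tag name depending on the release series. Because the note travels with the commit, it follows merges without any extra bookkeeping.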


This looks like it does what it sets out to do well. I think commits and changelogs serve separate purposes, but I have gone to commit some changes and wanted to at least see the text from the changelog enough times that I wrote a prepare-commit-msg git hook to help out. It detects changelogs and starts out the commit message with any new lines, or puts a little (commented) message in the commit message if no changelogs were altered, kind of as a reminder to think about whether the changes being committed really should include updates to the changelog. It's been very helpful, so much so that I installed it as a global hook!
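
A sketch of the core of such a hook, not the commenter's actual script. It assumes the changelog file is named CHANGELOG.md and that the staged changes are available as unified diff text (for example the output of `git diff --cached`); the sample diff is made up.

```python
def added_changelog_lines(staged_diff, changelog_name="CHANGELOG.md"):
    """Return the non-empty lines added to the changelog in a unified diff."""
    added = []
    in_changelog = False
    in_hunk = False
    for line in staged_diff.splitlines():
        if line.startswith("diff --git "):
            # A new per-file section starts; check whether it is a changelog.
            in_changelog = line.endswith("/" + changelog_name)
            in_hunk = False
        elif line.startswith("@@"):
            # Hunk header: the "---" / "+++" file header lines are behind us.
            in_hunk = True
        elif in_changelog and in_hunk and line.startswith("+"):
            text = line[1:].strip()
            if text:
                added.append(text)
    return added


def prefill_message(staged_diff):
    """Build the initial commit message text from staged changelog additions."""
    lines = added_changelog_lines(staged_diff)
    if lines:
        return "\n".join(lines) + "\n"
    # Lines starting with "#" are dropped by git's default cleanup when the
    # message is edited, so this acts only as a reminder.
    return "\n# No changelog entries staged. Should this change be mentioned there?\n"


SAMPLE_DIFF = "\n".join([
    "diff --git a/CHANGELOG.md b/CHANGELOG.md",
    "--- a/CHANGELOG.md",
    "+++ b/CHANGELOG.md",
    "@@ -1,2 +1,3 @@",
    " ## Unreleased",
    "+- Fix crash when the repository has no tags",
    " - Add CSV export",
    "diff --git a/src/main.py b/src/main.py",
    "--- a/src/main.py",
    "+++ b/src/main.py",
    "@@ -1,1 +1,2 @@",
    " import sys",
    "+import os",
])
print(prefill_message(SAMPLE_DIFF))
```

A real hook would write this text into the message file whose path git passes as the first argument to prepare-commit-msg, and would probably skip merges and amended commits.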


This only works if all team members have the exact same convention and discipline about commit messages, which is rarely the case, at which point a manually curated changelog is more useful than commit messages.


Do people actually write detailed commit messages? After years of buying into writing good commit messages, nowadays I usually just write "save" or "lol" and not once have I regretted it. If I'm looking through old commits I always search by the content of the commit, not the message. If I'm looking at someone else's code I just go ask them about it.


I am a prolific writer of comments and commit messages. It’s not common for me to write single-line commit messages¹, 4–8 lines is probably my most common size, and I’ve gone well above 100 on occasion. At my last place of employment, I think I had 17 of the 20 longest commit messages in the main repository.

I act like this on public and private repositories.

I know others I have worked with have benefited from and appreciated my detailed commit messages, and I’ve even thereby convinced some to write at least three or four decent lines rather than just one bad line.

I myself have certainly benefited from the verbose commit messages that my past self has written.

But here’s the real secret to it: I have benefited by writing these verbose commit messages, even in cases where it’s unlikely anyone (including me!) will ever read them again. You know rubber ducking <https://en.wikipedia.org/wiki/Rubber_duck_debugging>? Commit messages are about describing what you’ve done and why, and the act of writing that down helps you to think about it. More than a few times have I felt the need to justify the approach taken in a commit message, only to realise as I explain it that I had neglected some key consideration so that the approach I used wouldn’t handle certain corner cases, or that a better technique was possible.

I also write changelogs manually, because commit messages and changelogs have somewhat different purposes. And that helps, too, especially in case of deprecation or breaking change, where the changelog should ideally guide the reader in what they should change.

———

¹ Prominent likely exceptions are found in new projects where I’m doing a bit of everything in an unstructured fashion, and commits on my website repository that just add a new article.


In a real working environment you should be required to write good commit messages. It’s not about you: you may be okay asking someone else, but other people might not want to ask you what you did in your “lol” commit. So yeah, we usually have pretty strict requirements for commit messages that we at least try to follow.


This attitude works for personal projects but will probably be somewhat career limiting working in a team.


According to Conventional Commit, what is the correct type and scope to use when upgrading dependencies?


I think it depends on the observable effect of the upgrade to a consumer of your component.

Did the dependency upgrade just improve performance? Fix a bug? Add a new capability? Remove a backward compatible feature?

It certainly doesn’t automatically follow that if a dependency upgrade adds a feature, that is a feature addition in your code, because you haven’t changed your code to use the new feature. But it could be: if you upgraded a parser library so it now supports strings longer than 4K, maybe your component now supports strings longer than 4K too, and that merits a minor version bump.

It can be a pretty subtle judgement - especially with potentially deep dependency trees like are common in node.

If you’re bumping a dependency because the dependency has bumped one of its dependencies… can be tricky to figure out if there’s actually a noticeable effect.
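
Once the judgement call about the type is made, the mapping to a version bump is mechanical. A small sketch of the Conventional Commits to SemVer correspondence (fix is a patch, feat is a minor, a breaking change is a major); the function names and the (type, breaking) input shape are assumptions for illustration.

```python
# Ordered from smallest to largest bump.
BUMP_ORDER = ["none", "patch", "minor", "major"]


def bump_for(commit_type, breaking=False):
    """Map one commit's type to the SemVer bump it implies."""
    if breaking:
        return "major"
    if commit_type == "feat":
        return "minor"
    if commit_type == "fix":
        return "patch"
    # chore, build, docs, etc. imply no bump unless they are breaking.
    return "none"


def release_bump(commits):
    """Return the largest bump implied by an iterable of (type, breaking) pairs."""
    largest = "none"
    for commit_type, breaking in commits:
        candidate = bump_for(commit_type, breaking)
        if BUMP_ORDER.index(candidate) > BUMP_ORDER.index(largest):
            largest = candidate
    return largest


# A dependency bump recorded as "chore" changes nothing for consumers; the
# same bump recorded as "feat" (say, longer strings are now supported)
# turns the release into a minor one.
print(release_bump([("chore", False), ("fix", False)]))
print(release_bump([("feat", False), ("fix", False)]))
```

This is why the choice of type for a dependency upgrade matters: it decides what consumers are told about the release, not just how the history reads.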


That's "chore" in my company. The scope is usually the component (frontend/backend/etc.).


Is it a "public dependency"? In that case it's a breaking change for your crate if it's a breaking change in the dep.


I use build type with scope of the package manager. For example:

build(poetry): update alembic dependency to 1.7.0


According to the Angular docs, build is:

build: Changes that affect the build system or external dependencies (example scopes: gulp, broccoli, npm)

so using build for things that affect /src seems wrong.


I was with you till the last line. Npm and Poetry are fulfilling the same role of external dependencies, what am I missing here?


A similar project, tightly integrated with Gradle: https://github.com/shipkit/shipkit-changelog


This is so awesome! I’ve been waiting for something like this written in Rust for a long time!


Out of curiosity, why does it matter what language it is written in?


I initially was interested in Rust because of performance + speed + safety, but now I have to say that cargo is a big selling point for me.

I always used to be scared of compiling software myself because I never seemed to be able to get it to work without endless headaches. Now, I generally find it easy to compile Rust programs if they aren't in my package manager, and with cargo install-update https://github.com/nabijaczleweli/cargo-update I find it easy to keep the software up to date. I have higher confidence that I can get hobbyist Rust software working, and the more Rust software I use, the more familiar I am with the ecosystem and the more comfortable I am.

If this was written in some obscure language I wasn't familiar with, I'd be less confident I would be able to run it at all, let alone keep it updated, and I may not bother even trying to install it.


Thank you. I really appreciate you sharing your perspective.


Also, some languages (like Go, C, C++, Rust) make it easier to get relatively self contained binaries, as opposed to a Python or Perl script with a lot of external dependencies that one has to install and manage.



