Hello fellow rustaceans! Recently, there was a thread about how we can grow this community (how can I link to posts across servers?), where I already talked briefly about this topic, saying that I did not know if it is worthy of a full post here, as most things seem to be pretty professional looking links to talks and blogs. I’ve gotten some encouragement to post it, so here we go:
When to use a library instead of a CLI
I’m working on a little project called Autocrate in my free time, which aims to streamline the release and publishing process (what exactly it does isn’t really important for this discussion). Autocrate uses git to get the path of the current repo, tags and pushes releases, generates a changelog from commits and so on.
Up until a week ago, I was just using the `git2` library crate, which offers the functionality of `libgit2` for Rust. While good, using this crate is much more complicated than, for example, just executing `git push` from my program using `std::process::Command`. I am only using the porcelain functionality of git (as of now), so everything I need could be achieved by calling the CLI.
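To make the comparison concrete, here is a minimal sketch of what the CLI route could look like with `std::process::Command`. The helper names `git_push_command` and `run`, and the remote and branch parameters, are made up for illustration; this is not how Autocrate actually does it.

```rust
use std::io;
use std::path::Path;
use std::process::{Command, ExitStatus};

/// Builds (but does not run) a `git push <remote> <branch>` invocation for
/// the repository at `repo_dir`. Hypothetical helper for illustration.
pub fn git_push_command(repo_dir: &Path, remote: &str, branch: &str) -> Command {
    let mut cmd = Command::new("git");
    // Each argument is passed separately and no shell is involved, so
    // nothing needs quoting or escaping.
    cmd.arg("push").arg(remote).arg(branch).current_dir(repo_dir);
    cmd
}

/// Runs the command and waits for it to finish. Returns an `io::Error` if
/// the program could not be started at all; a failed push shows up as a
/// non-success `ExitStatus` instead.
pub fn run(mut cmd: Command) -> io::Result<ExitStatus> {
    cmd.status()
}
```

Calling `run(git_push_command(Path::new("."), "origin", "main"))` would then push from the current directory. Note that the caller has to check two separate things: whether the process started, and whether it exited successfully.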
Question
When is it acceptable to use CLI Commands instead of using libraries provided for that same software?
Is it generally better to use API/ABI from libraries, or is it maybe even better to use the CLI interface, reducing the list of dependencies?
Pro and Con of using Commands instead of libraries
| Pro | Con |
|---|---|
| Reduces the dependencies of a crate | Adds a dependency that cannot be tracked by cargo |
| Much easier to program for developers | The CLI interface is not versioned and might break in the future |
| Documentation of the CLI interface is often better than that of libraries | Bad usage of a command cannot be detected at compile time |
| | CLI program might not be available depending on architecture or platform |
(this is of course not an exhaustive list. I will edit it if something comes up in the thread.)
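The "not tracked by cargo" and "might not be available" rows mean the check moves from build time to run time. A rough sketch of such a check follows; the helper name `is_available` is hypothetical, and it assumes the program accepts a `--version` flag.

```rust
use std::process::{Command, Stdio};

/// Reports whether `program` can be started on this machine.
/// Hypothetical helper; assumes the program understands `--version`.
pub fn is_available(program: &str) -> bool {
    Command::new(program)
        .arg("--version")
        // Discard all input and output; only "did it start?" matters here.
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        // `status()` returns `Err` when the process could not be started,
        // for example because the program is not on `PATH`. A non-zero
        // exit code still counts as "available".
        .status()
        .is_ok()
}
```

A program could call `is_available("git")` once at startup and print a helpful message instead of failing halfway through a release.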
Edit
Alright then. Thank you for your answers. While using the git CLI would probably be fine, since it’s very stable and available on most systems (especially those for CI/CD), it might change and is at best hacky. I’ll be doing the “right” thing and use libgit2 instead of just calling CLI commands.
Generally 100% Rust is the gold standard, followed by Rust wrapping some C library. The least preferable option would be invoking an external command and parsing its output. The reason for this is portability.
Pure Rust is going to be the most portable option: if it’s written right, it should work on any architecture and OS supported by Rust. If it’s `no_std`, it might even run on bare metal.
The second most portable option is going to be wrapping a C library as now you’re dependent on that library being available for a given architecture and platform.
The least portable option is invoking a CLI app and parsing its output. You’re now dependent on that app being available for a given platform and architecture, but you’re also going to be dependent on any differences of its arguments and output. Apple for instance is infamous for using their own unique flavor of most CLI apps (usually a variant of BSD versions, while GNU versions are typical on Linux or even Windows).
As of now, I don’t really care about portability itself, but I do want to write good software. What you said makes sense to me, so I’ve decided I will be using the library version, even if that means I have to read up on its documentation to understand the git API. Thanks!
You might not care about portability to other platforms but parsing command line output can also easily break if someone uses the same OS but set to a different language or to different localization settings (e.g. decimal comma instead of point).
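A small sketch of the decimal comma problem, plus one common mitigation. The helper names are made up. Pinning `LC_ALL=C` is a convention that many tools on Unix-like systems honour, but it is an assumption that the tool you call does so.

```rust
use std::process::Command;

/// Parses a decimal number the way Rust's standard library does: only `.`
/// is accepted as the decimal separator.
pub fn parse_decimal(text: &str) -> Option<f64> {
    text.trim().parse::<f64>().ok()
}

/// Builds a command with `LC_ALL=C` set in the child's environment, asking
/// for untranslated, default-format output. Hypothetical helper; whether
/// the variable is respected depends on the tool and the platform.
pub fn with_c_locale(program: &str) -> Command {
    let mut cmd = Command::new(program);
    cmd.env("LC_ALL", "C");
    cmd
}
```

With this, `parse_decimal("3.14")` succeeds while `parse_decimal("3,14")` returns `None`, which is exactly the kind of breakage that only shows up on someone else's machine.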
Here are some other reasons:
- users can install any version of a CLI command, so you could get weird breakage, whereas you control the version with a cargo dependency
- CLI commands can have variability with user configuration, whereas libraries typically don’t
But each of those can also be a benefit:
- you probably don’t need to patch your software if a security vulnerability lies in the CLI command, the user can just patch their system
- users being able to customize the CLI may make your app easier to write (i.e. you don’t need to reproduce those same configuration options)
So it really depends on your target use case. But in general, a library is probably better for code you intend to share with others or put in production.
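To illustrate the version point above: with a CLI, the best you can do is ask the installed tool at run time and parse the answer. This sketch assumes the output of `git --version` starts with `git version <major>.<minor>`; the function name is hypothetical.

```rust
/// Extracts `(major, minor)` from the output of `git --version`.
/// Assumes the text starts with "git version <major>.<minor>"; anything
/// after the minor number (patch level, vendor suffix) is ignored.
pub fn parse_git_version(output: &str) -> Option<(u32, u32)> {
    let rest = output.trim().strip_prefix("git version ")?;
    let mut parts = rest.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    Some((major, minor))
}
```

A caller could then compare tuples, for example `parse_git_version(text) >= Some((2, 30))`, and refuse to continue on an older install. With a cargo dependency, the same constraint is a single line in `Cargo.toml`, checked before the program ever runs.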
Another reason to use libraries is communication. Would you prefer to receive a GitCommitResult in your code, or have to parse the stdout of the subprocess? If you need complex communication with the other program, then it needs to provide rpc or some other form of inter-process communication. A library avoids this issue.
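As a sketch of what "parse the stdout" means in practice, here is a hand-rolled parser for one line of `git log --oneline`. The `CommitSummary` type and the function name are invented for this example. The parser assumes each line is an abbreviated hash, one space, then the subject, and the user's git configuration can alter that format.

```rust
/// Hypothetical typed result that we have to build ourselves from text.
#[derive(Debug, PartialEq, Eq)]
pub struct CommitSummary {
    pub short_hash: String,
    pub subject: String,
}

/// Parses one line in the assumed format "<abbreviated hash> <subject>".
/// Returns `None` for anything that does not look like that.
pub fn parse_oneline(line: &str) -> Option<CommitSummary> {
    let (hash, subject) = line.trim_end().split_once(' ')?;
    // Reject lines whose first word is not a hexadecimal hash.
    if hash.is_empty() || !hash.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    Some(CommitSummary {
        short_hash: hash.to_string(),
        subject: subject.to_string(),
    })
}
```

Every assumption in that parser is something a library would have handed over as a typed value, with the compiler checking how it is used.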
I can’t speak for others, but personally, for anything that isn’t intended to be hacky or something like a CI/CD script, I use a library where possible. Specifically with Rust, libraries are statically linked, so there’s no need to worry about the interface/version being different on different machines at runtime, and the machine doesn’t (usually) need to install a dependency for your program. Libraries which wrap a CLI tool and call out to it internally are still useful even if the CLI tool isn’t statically linked since they already have all the glue to connect the code with the CLI tool set up, making it easier to use the tool usually.
This isn’t always possible of course, but if you’re calling out to something like git which has an extremely stable interface and has had all the features you need for many years, then the benefits of statically linking aren’t nearly as high. Also, sometimes the dependency is just too complex to be a statically linked library (like if you need to call cmake or something). Still, if some kind of popular wrapper library for these exists, I use it where possible so I don’t need to reimplement what the wrapper library already has.
Thanks for your answer. As far as I can see it, the git library isn’t too much to add to my dependencies, so I’ll be using that for reasons mentioned in other comments too. I believe it will help my software to become more stable and not have to be lucky with the environment we’re in.
In my mind, executing an external program is fine in scripts, but for more robust programs you should use libraries. There are of course exceptions. I once used libsvn instead of just calling the svn binary (as you understand, this was a long time ago, when svn was still the way to version software). libsvn turned out to be the most horrible library, so in that case it would have been better to call the binary instead. But, in general, avoid runtime dependencies and errors if possible. Libraries also normally allow for much better error handling. So, I always use a library if one is available and not obviously horrible.
Thank you for your answer. The git crate is still not that easy to understand, but the documentation is there and the interface is alright once you figured out what needs to be done, so I’ve decided to use it as a library.
Another aspect is that calling a CLI command is way slower than a library function (in general). This is most apparent on short-running commands, since the overhead is mostly fixed per command invocation rather than scaling with the amount of work or data.
As such I would at the very least keep those commands out of any hot/fast paths.
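One way to do that, sketched under the assumption that the value does not change while the program runs: call the command once and cache the answer. The function name `repo_root` is hypothetical. `git rev-parse --show-toplevel` prints the top-level directory of the current repository.

```rust
use std::process::Command;
use std::sync::OnceLock;

// Holds the cached answer; `None` inside means the lookup failed.
static REPO_ROOT: OnceLock<Option<String>> = OnceLock::new();

/// Returns the repository root, spawning `git` only on the first call.
/// Later calls reuse the cached value and pay no process-spawn overhead.
pub fn repo_root() -> Option<&'static str> {
    REPO_ROOT
        .get_or_init(|| {
            let output = Command::new("git")
                .args(["rev-parse", "--show-toplevel"])
                .output()
                .ok()?;
            if !output.status.success() {
                return None;
            }
            let text = String::from_utf8(output.stdout).ok()?;
            Some(text.trim_end().to_string())
        })
        .as_deref()
}
```

The trade-off is that a failure is cached too, so this only suits values that are looked up once per run.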