I’m just generalizing, like if you want to copy some clever feature or modify some Python program you came across, what are the red or green flags indicating how well your (or particularly some hobbyist’s/your early learning self’s) results are likely to turn out?
Also, how can you tell when reading into such a project is going to be a major project that is beyond the scope of your ultimate goals? For instance, I wanted to modify Marlin 3D printer firmware for hardware that was not already present in the project, but as an Arduino copy-pasta hobbyist, despite my best efforts, that was simply too much for me to tackle at the time because of the complexity of the code base and my limited skills.
How do you learn to spot these situations before diving down the rabbit hole? Or, to put it another way, what advice would you give yourself at this stage of the learning curve?
TL;DR I look at the package first, before opening it. Unfortunately, often the packaging says a lot about the project. Things I look for:
- A proper README
- A section in the README about how to set up a dev environment - without that it’s often just a guessing game I don’t have time for
- A section in the README about how to deploy or run the project
- Test cases that actually mean something, not just boilerplate `assert true`
- Some kind of CI setup with steps for code quality (lint, format, type checking, …)
- Documentation generation
- Pull/merge requests / external project activity
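A rough sketch of how that first look at the packaging could be automated for a Python project. The `CHECKS` table and the `packaging_flags` name are made up for illustration, and the file patterns are only common conventions, so a miss here is a hint rather than a verdict.

```python
from pathlib import Path

# Hypothetical checklist: each label maps to glob patterns (relative to the
# repo root) whose presence suggests the item exists. Adjust to taste.
CHECKS = {
    "README": ["README*"],
    "tests": ["tests", "test", "test_*.py"],
    "CI config": [".github/workflows/*.yml", ".github/workflows/*.yaml", ".gitlab-ci.yml"],
    "lint/format config": ["pyproject.toml", "setup.cfg", ".flake8", "ruff.toml"],
    "docs": ["docs", "mkdocs.yml"],
}


def packaging_flags(repo_root):
    """Return {label: bool} saying whether any pattern for that label matched."""
    root = Path(repo_root)
    return {
        label: any(any(root.glob(pattern)) for pattern in patterns)
        for label, patterns in CHECKS.items()
    }


if __name__ == "__main__":
    # Print a green/red line per checklist item for the current directory.
    for label, found in packaging_flags(".").items():
        print(f"{'green' if found else 'red':5}  {label}")
```

It only checks that the files exist, not that the README or the tests are any good, so it narrows down what to read rather than replacing the reading.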
Some programming languages still have either no tooling or communities that shun it. C, C++ and PHP come to mind. Many projects from those languages are written by old folk, people who only use editors (vim, emacs, nano, …) that don’t come with tooling out of the box, or who assume a mighty amount of things about dev machines. They are languages and communities I shun.
Go, Python, Rust, TypeScript, and other languages are chock-full of developer tooling, with people using IDEs or at least editors with extensions/plugins for improved developer lives. Rust and TypeScript especially are often set up with a bunch of tooling that has to be actively ignored by the developer. Often it’s much easier to get started, run the project, run tests, read the code, and load it into an IDE or editor with advanced language support (goto definition, refactoring, integrations for running and testing code, …).
A section in the README about how to setup a dev environment - without that it’s often just a guessing game I don’t have time for
Preferable to not require a special setup.
Even if you don’t have a special setup, having a section telling you that is still a helpful thing to quickly assess a new project.
I appreciate knowing that a project should Just Work with minimal setup so I don’t have to guess or make assumptions
Preferable to have a description of a setup at all instead of “just” `./config && make install`
Generally mostly by cyclomatic complexity:
- How big are the methods overall?
- Do methods have a somewhat single responsibility?
- How is the structure: is everything interconnected and calling each other, or are there some levels of orchestration?
- Do they have any basic unit tests, so that if I want to add anything, I can copy-paste some test with an entrypoint close to my modification to see how things are going?
- Bonus: they actually have linter configuration in their project, and consistent, commonly used style guidelines
If the code structure itself is good, but the formatting is bad, I can generally just run the code through a linter that fixes all the formatting. That makes it easier to use, but it’s probably not something I’d actually contribute PRs to.
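For Python code, the "how big and how branchy are the methods" check can be roughed out with the standard `ast` module. This is only a sketch: the branch count below is a crude stand-in for real cyclomatic complexity, and `function_metrics`, `BRANCH_NODES` and `SAMPLE` are names invented for the example.

```python
import ast

# Node types counted as decision points. This approximates cyclomatic
# complexity; it is not the exact textbook definition.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.BoolOp, ast.IfExp, ast.ExceptHandler)


def function_metrics(source):
    """Yield (name, line_count, branch_count) for every function in the source."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            lines = node.end_lineno - node.lineno + 1
            branches = sum(isinstance(child, BRANCH_NODES) for child in ast.walk(node))
            yield node.name, lines, branches


SAMPLE = '''
def tidy(x):
    return x + 1

def tangled(x, y):
    if x and y:
        for i in range(x):
            if i % 2:
                y += i
    return y
'''

if __name__ == "__main__":
    for name, lines, branches in function_metrics(SAMPLE):
        print(f"{name}: {lines} lines, {branches} branch points")
```

Sorting the output by line count or branch count over a whole repository gives a quick list of the functions most likely to hurt.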
How do you learn to spot these situations before diving down the rabbit hole? Or, to put it another way, what advice would you give yourself at this stage of the learning curve?
Probably some kind of metric of “If I open this code in an IDE and add my modification, how long will it take before I can find a suitable entrypoint, and how long before I can test my changes?” - if it’s like half a day of debugging and diagnostics before I can even get started trying to change anything, it seems a bit tedious.
Edit: Though also, how much time is this going to save you if you do implement it? If it saves you weeks of work once you have this feature, but it takes a couple of days, I suppose it’s worth going through some tedious stuff.
But then again: I’d also check: are there other similar libraries with “higher scoring” “changeability metrics”
So in your specific case:
I wanted to modify Marlin 3D printer firmware
Is there any test with a mocked 3D printer to test this, or is this a case of compiling a custom firmware, installing it on your actual printer, potentially bricking it if the firmware is broken, etc.?
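To illustrate what "a test with a mocked printer" means, here is a minimal host-side sketch in Python. It is not Marlin code (the firmware itself is C++): `home_all_axes` is a hypothetical helper that talks to a serial-like object, and the fake port stands in for the real hardware. The only printer facts assumed are that `G28` is the homing G-code and that the firmware answers `ok`.

```python
from unittest.mock import Mock


def home_all_axes(port):
    """Send the G28 (home) command and report whether the printer replied 'ok'.

    `port` is any object with write() and readline(), such as a serial port.
    """
    port.write(b"G28\n")
    reply = port.readline()
    return reply.strip() == b"ok"


# A fake port that records what was written and always replies "ok".
fake_port = Mock()
fake_port.readline.return_value = b"ok\n"

assert home_all_axes(fake_port) is True
fake_port.write.assert_called_once_with(b"G28\n")
```

If a project has something like this, you can try a change without a printer on the desk. If it doesn't, every experiment involves flashing real hardware.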
For me, it’s mostly Java and Kotlin. I look for the same kind of things. Things that I like to see:
- Short methods.
- Small classes
- Sensible packages
- Variables declared to Interfaces not implementations
- Single Responsibility Principle applied.
- DRY applied.
- Good names for variables and methods
- Few instance variables
- Few static members
- No comments, because you don’t need them
- Uses lambdas, Streams and Optional
- No empty Catch{} blocks
- No f*&^*cking! arrays.
I can generally tell in a few minutes if something is going to be a pain to work with.
Do they have any basic unittests, so that if I want to add anything, I can copypaste some test with an entrypoint close to my modification to see how things are going
Do you mean their code is already set up with some kind of output to the terminal that you can use to add a unit test into as well?
I don’t even recall what I was messing with a while back, I think it was Python, but adding a simple print test didn’t work. I have no idea how they were redirecting print(), but that was a wall I didn’t get past at the time.
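There is no way to know what that particular project did, but one common way a `print()` can seem to vanish in Python is that `sys.stdout` has been swapped out, for example with `contextlib.redirect_stdout`. Test runners such as pytest also capture output by default (its `-s` flag turns that off). A minimal sketch, with `noisy_feature` as a made-up function:

```python
import contextlib
import io
import sys


def noisy_feature():
    print("debug: reached noisy_feature")             # goes to the current sys.stdout
    print("debug: still visible", file=sys.stderr)    # stderr is not affected below
    return 42


buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    result = noisy_feature()  # the first print lands in `buffer`, not the terminal

captured = buffer.getvalue()
print(f"result={result}, captured={captured!r}")
```

Printing to `sys.stderr`, or using the `logging` module, is a common workaround when plain prints disappear, though a project can redirect those too.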
Bonus: they actually have linter configuration in their project, and consistent commonly used style guidelines
That stuff seems like a language on top of a language for me, and when it errors I get really lost. I usually have trouble with the whole IDE thing in the first place because I hate the path of least resistance exploitation thing. I keep trying to figure out the alternatives and often get lost in their complexities. I’ve tried most of them, but when I see the networking logs for most options people gravitate towards, I am appalled. I think I’m about to try a Neovim variant with an offline AI code completion option soon.
how much time is this going to save you if you do implement it? If it saves you weeks of work once you have this feature, but it takes a couple of days, I suppose it’s worth going through some tedious stuff.
I wouldn’t call anything I do “for work”. I get sick of something that annoys me and want to go in and fix the issue despite being completely unqualified, but naive enough to try. I don’t even know how to really research what alternate libraries might exist. I pretty much go hunting for examples and copy whatever they use. API documentation is often a hurdle for me because the typical shorthand syntax makes a ton of assumptions about preexisting knowledge.
With stuff like Marlin, I seem to like the hardware side of things. I think I was looking to add an LCD display that was not a configuration option in Marlin and was trying to determine how to add a driver for it. That was simply far too ambitious for me a few years ago and still is.
Do you mean their code is already set up with some kind of output to the terminal that you can use to add a unit test into as well?
I don’t even recall what I was messing with a while back, I think it was Python, but adding a simple print test didn’t work. I have no idea how they were redirecting print(), but that was a wall I didn’t get past at the time.
Yea, probably not every language has a concept of unittests, but basically test code.
Like if you have a calculator, there would be a test (outside of the real project) of like
If Calculator.Calculate(2 + 2) then assert outcome = 4
That way - if let’s say the calculator only does + operations - you could still copy that test line, and create a new test of
If Calculator.Calculate(5 * 5) then assert outcome = 25
Your test will fail initially, but you can just run through it in a debugger, step into the code, figure out where it’s most appropriate to add a * operator function, then implement it and see your test succeed.
Another benefit is that if you submit your change as a PR with the test, the repo maintainer doesn’t have to determine whether your code actually works just by looking at it or by actually running the calculator; your test proves you’ve added something useful that works (and that you didn’t break the existing tests).
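In Python, the calculator example could look roughly like this with the standard `unittest` module. The `Calculator` class here is a toy written for the illustration (it takes the expression as a string like `"2 + 2"`), not code from any real project.

```python
import unittest


class Calculator:
    """Toy calculator that evaluates 'a <op> b' expressions (illustrative only)."""

    def calculate(self, expression):
        left, operator, right = expression.split()
        a, b = int(left), int(right)
        if operator == "+":
            return a + b
        if operator == "*":  # the newly added feature
            return a * b
        raise ValueError(f"unsupported operator: {operator}")


class CalculatorTest(unittest.TestCase):
    def test_addition(self):  # the test that already existed
        self.assertEqual(Calculator().calculate("2 + 2"), 4)

    def test_multiplication(self):  # copied from the test above, then adapted
        self.assertEqual(Calculator().calculate("5 * 5"), 25)
```

Saved as, say, `test_calculator.py`, it can be run with `python -m unittest test_calculator`. Without the `*` branch, the second test fails with the `ValueError`, and the traceback points at the spot where the new operator needs to go.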
That stuff seems like a language on top of a language for me, and when it errors I get really lost.
If you’re just programming for yourself without the intent to submit it for a PR, you can just throw away the linter file. But I mentioned it was good to have in a project, because if there were multiple people working on it, all with their own style, the code can become a mess quite fast
I get sick of something that annoys me and want to go in and fix the issue despite being completely unqualified, but naive enough to try.
Well, I mean, that’s basically all the things, right? You start completely unqualified, mess around for a while, and after a while you’re more qualified the next time…
With stuff like Marlin, I seem to like the hardware side of things.
Just messing around with stuff you like is a good way to learn - though in my experience doing anything with hardware is way more difficult than just plain software. If you have to interface with hardware, it’s very often pretty obscure stuff, like sending the correct hardware instructions to a driver, or even just to “hardware pins”… Modifying a driver as a kind of starter project doesn’t sound like something I’d recommend.
How do you learn to spot these situations before diving down the rabbit hole? Or, to put it another way, what advice would you give yourself at this stage of the learning curve?
Think of the absolute simplest thing you could do that is somewhat related to your actual goal. Now do something even simpler than that. Should be easy right?
This often happens for aspiring game devs. Everyone is like “I wanna make the next <insert AAA game that had 20 million dollar budget>”. But start by making Pong. Should be simple right?
The title question is very broad, varied, and difficult. It depends.
For anything that is not a small script or a small, obviously scoped task, you can’t assess it at first glance.
So for a project/task/extension like you wrote, it’s a matter of:
Are there docs or a guide specifically for what I want to do/add? If yes, the expectation is that it is viable and reasonably doable with low risk.
If there is no guide, the assessment is already an exploration and analysis. How is the project structured, are there docs for it or for my concerns, where do I expect my change has to go and what does it have to touch, and what are the risks of issues, unknown difficulties and extra effort? The next step would already be prototyping, and then implementing. Both of these can be started with a “let’s see” and timebox approach, where you remain mindful of when the known or invested effort or risk increases, so you can stop or rethink.
- Use cloc to get an idea of line count.
- See how hard it is to find the entry point
- Try building it
- Try hacking a feature into it
- Try hacking a bigger feature
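For a Python repository, the first two steps can be roughed out in a few lines. This is only a sketch: it counts raw lines (blank lines and comments included, unlike cloc) and treats any module with a top-level `if __name__ == ...` check as a candidate entry point. `has_main_guard` and `survey` are names made up here.

```python
import ast
from pathlib import Path


def has_main_guard(source):
    """True if the module has a top-level `if __name__ == ...:` block."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.If) and isinstance(node.test, ast.Compare):
            left = node.test.left
            if isinstance(left, ast.Name) and left.id == "__name__":
                return True
    return False


def survey(repo_root):
    """Return (total_lines, entry_point_paths) for the .py files under repo_root."""
    total, entry_points = 0, []
    for path in sorted(Path(repo_root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        total += len(text.splitlines())
        try:
            if has_main_guard(text):
                entry_points.append(path)
        except (SyntaxError, ValueError):
            pass  # skip files the parser cannot handle (e.g. Python 2 sources)
    return total, entry_points


if __name__ == "__main__":
    lines, entries = survey(".")
    print(f"{lines} lines of Python, {len(entries)} candidate entry points")
    for entry in entries:
        print("  ", entry)
```

Packages that expose their entry points through packaging metadata instead of a main guard will not show up here, so an empty list just means looking somewhere else.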
Thing is, for repos of any typical size, it’s going to take a lot more than a glance to know.
For example, the Atom Editor codebase is immaculate, with the best formatting/commenting I’ve ever seen. But getting it to build is a nightmare, so it’s too hard to contribute to/hack.
Lots of times you just have to dive down the rabbit hole.