For unit tests, you should know the input and expected output from the start. Really responsible devs write the unit tests first, because you should know what you're going to put in and what you'll get out before you start writing anything. If you find yourself making the same mistakes in your tests as you do in your code, you might be trying to do too much logic in the test itself. Consider moving that logic to its own testable class, or generating a static set of input/output values once and testing against those instead of computing them on the fly.
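
As a rough sketch of the "static set of input/output values" idea: the function below is a hypothetical stand-in, and the expected values were worked out by hand rather than computed inside the test.

```python
import unittest


def celsius_to_fahrenheit(celsius: float) -> float:
    """Hypothetical function under test (an assumption for this example)."""
    return celsius * 9 / 5 + 32


# Expected outputs written down ahead of time, not derived in the test itself.
KNOWN_CASES = [
    (-40.0, -40.0),
    (0.0, 32.0),
    (37.0, 98.6),
    (100.0, 212.0),
]


class CelsiusToFahrenheitTest(unittest.TestCase):
    def test_known_values(self):
        for celsius, expected in KNOWN_CASES:
            # subTest reports each failing pair separately.
            with self.subTest(celsius=celsius):
                self.assertAlmostEqual(celsius_to_fahrenheit(celsius), expected)
```

Because the test contains no arithmetic of its own, a bug in the formula can't be silently duplicated in the test.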

How granular your tests should be is a matter of constant debate, but generally I believe that different file/class = different test. If I have utility method B that's called in method A, I generally test A in a way that ensures the function of B is done correctly instead of writing two different tests. If A relies on a method from another class, that gets mocked out and tested separately. If B's code is suitably complex to warrant an individual test, I'd consider moving it to its own class.
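
Here's a minimal sketch of that layout using `unittest.mock` from the standard library. `Cart`, `PriceService`, and the method names are all made up for illustration.

```python
import unittest
from unittest.mock import Mock


class PriceService:
    """Hypothetical collaborator in another class; it would get its own tests."""

    def unit_price(self, sku: str) -> float:
        raise NotImplementedError("would talk to a real backend")


class Cart:
    def __init__(self, prices: PriceService):
        self._prices = prices

    def _round_money(self, amount: float) -> float:
        # Utility method "B": simple enough to be covered through total().
        return round(amount, 2)

    def total(self, items: dict) -> float:
        # Method "A": uses B internally and PriceService externally.
        raw = sum(self._prices.unit_price(sku) * qty for sku, qty in items.items())
        return self._round_money(raw)


class CartTest(unittest.TestCase):
    def test_total_is_rounded_to_cents(self):
        prices = Mock(spec=PriceService)  # the other class is mocked out
        prices.unit_price.return_value = 1.234
        cart = Cart(prices)

        # Exercising A also checks that B's rounding was applied.
        self.assertEqual(cart.total({"widget": 2}), 2.47)
        prices.unit_price.assert_called_once_with("widget")
```

If `_round_money` ever grew enough branches to need its own cases, that would be the signal to pull it out into a separate class.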

If you have a super simple method (e.g. an API endpoint method that only fetches data from another class), or something that talks with an external resource (filesystem, database, API, etc.), it's probably not worth writing a unit test for. Just make sure it's covered in an integration test.

Perhaps most importantly, if you're having a lot of trouble testing your code, think about whether it's the tests or the code that is the problem. Are your classes too tightly coupled? Are your external access methods trying to perform complex logic with fetched data? Are you doing too much work in a single function? Look into some antipatterns for the language/framework you're using and make sure you're not falling into any pitfalls. Don't make your tests contort to fit your code; make your code easy to test.
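
One way this can look in practice, as a sketch with invented names: keep the logic in a pure function and pass the external access in from outside, so the test never has to touch the real resource.

```python
from typing import Callable, Iterable


def average_order_value(totals: Iterable[float]) -> float:
    """Pure logic: no I/O, so it can be tested with plain lists."""
    totals = list(totals)
    if not totals:
        return 0.0
    return sum(totals) / len(totals)


def report_average(fetch_totals: Callable[[], Iterable[float]]) -> str:
    """Thin wrapper: the fetcher is injected, so a test can hand over a stub."""
    return f"Average order: {average_order_value(fetch_totals()):.2f}"


# In production fetch_totals would hit a database; here a lambda stands in.
print(report_average(lambda: [10.0, 20.0, 30.0]))  # prints "Average order: 20.00"
```

The fetching code ends up so thin that an integration test is enough for it, and the interesting logic gets ordinary unit tests.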

If ever you feel lost, remember the words of the great Testivus.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago) (1 children)

First off, thanks for the help!

> Really responsible devs write the unit tests first, because you should know what you’re going to put in and what you’ll get out before you start writing anything.

I've obviously heard the general concept, but this is actually pretty helpful, now that I'm thinking about it a bit more.

I've written pretty mathy stuff for the most part, and a function might return an appropriately sized vector containing what looks like the right numbers to the naked eye, but which is actually wrong in some high-dimensional way. Since I haven't even thought of whatever way it's gone wrong, I can't very well test for it. I suppose what I could do is come up with a few properties the correct result should have, unrelated to the actual use of it, and then test them and hope one fails. It might take a lot of extra time, but maybe it's worth it.
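
A rough sketch of what I mean, with softmax as a stand-in for the real function (the function and the particular properties are just my picks for illustration). It throws seeded random inputs at the function and checks properties instead of exact values.

```python
import math
import random


def softmax(xs):
    """Stand-in for the mathy function under test (an assumption)."""
    shift = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def check_softmax_properties(xs):
    ys = softmax(xs)
    # Same length, non-negative entries, and the entries sum to 1.
    assert len(ys) == len(xs)
    assert all(y >= 0.0 for y in ys)
    assert math.isclose(sum(ys), 1.0, rel_tol=1e-9)
    # Adding a constant to every input should not change the output.
    shifted = softmax([x + 3.0 for x in xs])
    assert all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
               for a, b in zip(ys, shifted))
    # A larger input should never get a smaller output (small float slack).
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    assert all(ys[i] <= ys[j] + 1e-12 for i, j in zip(order, order[1:]))


rng = random.Random(0)  # fixed seed so any failure is reproducible
for _ in range(200):
    sample = [rng.uniform(-10, 10) for _ in range(rng.randint(1, 8))]
    check_softmax_properties(sample)
```

None of these checks say the numbers are right for my actual use case, only that they can't be wrong in these particular ways.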

How do you deal with side effects, if what you're doing involves them?

[–] [email protected] 1 points 1 year ago (1 children)

For your vector issue, I'd go the route of some static examples if possible. Do you have a way to manually work out the answer that your code is trying to achieve?

For side effects, that may indicate what I referred to as tightly coupled code. Could you give an example of what you mean by "side effect"?

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

> For your vector issue, I’d go the route of some static examples if possible. Do you have a way to manually work out the answer that your code is trying to achieve?

Not necessarily. In this scenario I'd imagine it's a series of numbers as opposed to something more human-friendly exactly because there's internal complexity that's important but hard to manually survey, let alone generate. If you've worked with GANs at all, maybe it's a point in a latent space.

> For side effects, that may indicate what I referred to as tightly coupled code. Could you give an example of what you mean by “side effect”?

I mean it in the standard functional language way, if you're familiar. There's an operation that happens at some step of an algorithm, and it changes a data structure which is referred to or updated at another step. Sometimes you can't really avoid it, because the problem itself has an interconnection like that.

A sorting algorithm example, if that doesn't make this too complicated.

Concurrency is pretty much guaranteed to involve side effects, so let's say we're trying to implement some sort of bespoke sorting algorithm that runs in multiple threads, where each compare is large and complex enough that we have bugs.

If threads are interfering with each other in this program, how do you test for that? The whole thing won't give expected results, obviously, but another unsorted array or a failure to terminate doesn't tell you much. Each compare and each swap might look correct at first, and give properly typed results. Let's assume that each thread might traverse to anywhere in the array, so you can't just check when they're overlapping.
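
To make the setup concrete, here's a toy version with the only kind of check I can come up with: invariants on the final array. Everything here is made up for illustration (the worker, the lock-guarded compare-and-swap, the two invariants), and it only tells you that interference happened, not where.

```python
import random
import threading
from collections import Counter


def worker(data, lock, seed, steps):
    """Toy thread body: compare-and-swap random pairs anywhere in the array."""
    rng = random.Random(seed)
    for _ in range(steps):
        i, j = sorted(rng.sample(range(len(data)), 2))
        with lock:  # removing this lock is the kind of bug being hunted
            if data[i] > data[j]:
                data[i], data[j] = data[j], data[i]


def run_once(values, n_threads=4, steps=2000):
    data = list(values)
    lock = threading.Lock()
    threads = [threading.Thread(target=worker, args=(data, lock, seed, steps))
               for seed in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return data


def count_inversions(xs):
    return sum(1 for i in range(len(xs))
               for j in range(i + 1, len(xs)) if xs[i] > xs[j])


original = random.Random(42).sample(range(1000), 50)
result = run_once(original)

# Invariant 1: swaps only move elements, so nothing is lost or duplicated.
assert Counter(result) == Counter(original)
# Invariant 2: each swap fixes an out-of-order pair, so disorder never grows.
assert count_inversions(result) <= count_inversions(original)
```

A failed invariant says some thread stepped on another, but it still doesn't point at which compare or swap went wrong, which is the part I'm stuck on.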