this post was submitted on 14 Mar 2024

82 points (97.7% liked)


Any tips to help a scientist become a better programmer? (iusearchlinux.fyi)

submitted 11 months ago by [email protected] to c/programming

Hey there!

I'm a chemical physicist who has been using python (as well as matlab and R) for a lot of different tasks over the last ~10 years, mostly for data analysis but also to automate certain tasks. I am almost completely self-taught, and though I have gotten help and tips from professors throughout the completion of my degrees, I have never really been educated in best practices when it comes to coding.

I have some friends who work as developers but have a similar academic background as I do, and through them I have become painfully aware of how bad my code is. When I write code, it simply needs to do the thing, conventions be damned. I do try to read up on the "right" way to do things, but the holes in my knowledge become pretty apparent pretty quickly.

For example, I have never written a class and I wouldn't know why or where to start (something to do with the init method, right?). I mostly just write functions and scripts that perform the tasks that I need, plus some work with jupyter notebooks from time to time. I only recently got started with git and uploading my projects to github, just as a way to try to teach myself the workflow.

So, I would like to learn to be better. Can anyone recommend good resources for learning programming, but perhaps that are aimed at people who already know a language? It'd be nice to find a guide that assumes you already know more than a beginner. Any help would be appreciated.

[–] [email protected] 54 points 11 months ago (1 children)

Approach programming with the same seriousness that you’d expect a programmer to approach your field with. You say yourself you just want it to “do the thing, conventions be damned”.

Well how would you feel if someone entered your lab or whatever and treated the tools of your trade that way?

[–] owsei 14 points 11 months ago (1 children)

I would agree if OP was trying to get a job as a developer, however I don't think they are.

It's more like you used a beaker for something and shook it to mix water and salt, it's not the recommended way, but it's fine.

[–] [email protected] 5 points 11 months ago (1 children)

At some point, they're gonna have to debug it.

[–] robinm 30 points 11 months ago

Read your own code that you wrote a month ago. For every wtf moment, try to rewrite it in a clearer way. With time you will internalize what is or is not a good idea. Usually this means naming your constants, moving code into a function so that it gets a friendly name explaining what it does, or moving code out of a function because the abstraction you chose was not a good one. Since you have 10 years of experience it's highly possible that you already do that, so just continue :)
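As a minimal sketch of the kind of rewrite described above: the physics and the names `BOLTZMANN_EV_PER_K` and `thermal_energy_ev` are illustrative assumptions, not something from the original post.

```python
# Before: a magic number and an unexplained expression.
#   e = 8.617333262e-5 * t

# After: the constant has a name and the calculation has a friendly name.
BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def thermal_energy_ev(temperature_k: float) -> float:
    """Return the thermal energy k_B * T in electronvolts."""
    return BOLTZMANN_EV_PER_K * temperature_k


room_temperature_energy = thermal_energy_ev(300.0)
```

A month later, the second version can be read without having to remember what the number meant.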

If you are motivated I would advise taking a look at Rust. The goal is not really to be able to use it (even if it's nice to be able to write fast code to speed up your python), but the Rust compiler is like a very exacting teacher that will not forgive any mistakes while explaining why it's not a good idea to do that and what you should do instead. The quality of the error messages is crucial; this is what will help you understand and improve over time. So consider Rust an exercise in becoming a better python programmer: whatever you try to do in Rust, try to understand how it applies to python. There are many tutorials online. The official book is a good start. And in general, learning new languages with a very different paradigm is the best way to improve, since it will help you see things from a new angle.

[–] [email protected] 19 points 11 months ago* (last edited 11 months ago)

Most of the "conventions" (which are normally just "good practices") are there to make the software easier to maintain, to make teamwork more efficient, to manage complexity in large code-bases, to reduce the chance of mistakes and to give a little boost in productivity.

For example, using descriptive names for variables (i.e. "sampleDataPoints" rather than "x") reduces the chances of mistakes due to confusing variables (especially in long stretches of code) and allows others (and yourself if you don't look at that code for many months) to pick up much faster what's going on there in order to change it. Dividing your code into functions, on the other hand, promotes reusability of the same code in many places without the downsides of copy & paste of the same code all over the place, such as growing the code base (which makes it costlier to maintain) and, worse, unwittingly copying and pasting bugs so now you have to fix the same stuff in several places (and might even forget one or two) rather than just fixing it in that one function.

Stuff at a higher, software design level, such as classes, is meant to help structure the code into self-contained blocks with clear, well-controlled ways of interacting with each other. This reduces overall complexity (everything potentially connecting to everything else is the most complex web of connections you could have), increases productivity (less stuff to consider at any one point whilst writing code, as it can't access everything), reduces bugs (less possibility of mistakes when certain things can only be changed by a certain part of the code) and makes it easier for others to use your stuff (they don't need to know how your classes work, only how to talk to them, like a mini library). That said, it's perfectly feasible to achieve a similar result without classes, using scope only, though more advanced features of classes such as inheritance won't be easy to emulate like that.
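To make the "classes versus scope only" point concrete, here is a small sketch under my own assumptions (the counter and the names `Counter` and `make_counter` are hypothetical). Both versions hide the state and expose one controlled way to change it.

```python
class Counter:
    """Class version: state lives in the instance and changes only via methods."""

    def __init__(self) -> None:
        self._count = 0

    def increment(self) -> int:
        self._count += 1
        return self._count


def make_counter():
    """Closure version: state lives in the enclosing function's scope."""
    count = 0

    def increment() -> int:
        nonlocal count  # rebind the variable captured from make_counter
        count += 1
        return count

    return increment


class_counter = Counter()
closure_counter = make_counter()
class_counter.increment()
closure_counter()
```

The closure is fine for one small piece of state. Once you want several operations on the same data, or inheritance, the class version tends to stay readable for longer.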

That said, if your programs are small, pretty much one use (i.e. you don't have to keep on using them for years) and you're not having to work on the code as a team, you can get away with not using most "conventions" (certainly the design level stuff) with only the downside of some loss in productivity (you lose code clarity and simplification, which increases the likelihood of bugs and makes it slower to traverse and spot stuff in the code when you have to go back and forth to change things).

I've worked with people who weren't programmers but did code (namely Quants in Finance), and they're simply not very good at what is only a secondary job for them (Quants mainly do mathematical modelling). That is absolutely normal because, unlike actual developers, writing code well and efficiently is not what they have been focusing on for years.

[–] MajorHavoc 16 points 11 months ago (2 children)

The O'Reilly "In a Nutshell" and "Pocket Guide to" books are great for folks who can already code, and want to pick up a related tool or a new language.

The Pocket Guide to Git is an obvious choice in your situation, if you don't already have it.

As others have mentioned, you're allowed to ignore the team stuff. In git this means you have my permission to commit directly to the 'main' branch, particularly while you're learning.

Lessons that I've learned the hard way, that apply for someone scripting alone:

git will save your ass. Get in the habit of using it for everything ASAP, and it'll be there when you need it
find that one friend who waxes poetic about git, and keep them close. Usually listening politely to them wax poetic about git will do the trick. Five minutes of their time can be a real life saver later. As that friend, I know when you're using me for my git-fu, and I don't mind. It's hard for me to make friends, perhaps because I constantly wax poetic about git.
every code swan starts as an ugly duck that got the job done.
print(f"debug: {what_the_fuck_is_this}") is a valid pattern that seasoned professionals still turn to. If you're in a code environment that doesn't support it, then it's a bad code environment.
one peer who reads your code regularly will make you a minimum of 5x more effective. It's awkward as hell to get started, but incredibly worth it. Obviously, you traditionally should return the favor, even though you won't feel qualified. They don't really feel qualified either, so it works out. (Source: I advise real scientists about their code all the time. It's still wild to me that they, as actual scientists, listen to me - even after I see how much benefit I provide.)

[–] [email protected] 3 points 11 months ago (1 children)

print(f"debug: {what_the_fuck_is_this}") is a valid pattern that seasoned professionals still turn to. If you’re in a code environment that doesn’t support it, then it’s a bad code environment.

I've been known to print things to the console during development, but it's like eating junk food. It's better to get in the habit of using a logging framework. Insufficient logging has been in the OWASP Top 10 for a while, so you should be logging anyway. Why not logger.debug(f"{what_the_fuck_is_this}"), or get fancy with some different frameworks and logger.log(SUPER_LOW_LVL, f"{really_what_the_fuck_is_this}")?

You also get the bonus of not going back and cleaning up all the print statements afterward. All you have to do is set the running log level to INFO or something to turn all that off. There was a reason you needed to see that stuff in the first place. If you ever need to see all that stuff again, then change the log level to whatever grain you need.
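A minimal sketch of that workflow with Python's standard `logging` module. The logger name `analysis` and the `mean` function are made up for illustration.

```python
import logging

# Configure once, near the start of the script. DEBUG shows everything.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("analysis")  # hypothetical logger name


def mean(values):
    # Lazy %-style arguments are only formatted if the message is emitted.
    logger.debug("mean() called with %d values", len(values))
    return sum(values) / len(values)


mean([1.0, 2.0, 3.0])          # emits a DEBUG line

logger.setLevel(logging.INFO)  # silence debug output without deleting it
mean([4.0, 5.0])               # no DEBUG line now
```

The debug calls stay in the code. Switching them back on is a one-line change of the level.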

[–] MajorHavoc 3 points 11 months ago* (last edited 11 months ago)

Absolutely true.

And you make a great point that: print(f"debug: {what_the_fuck_is_this}") should absolutely be maturing into logger.log(SUPER_LOW_LVL, f"{really_what_the_fuck_is_this}")

Unfortunately I have found that when print("debug") isn't working, usually logging isn't set up correctly either.

In a solidly built system, a garbage print line will hit the logs and raise several alerts because it's poorly formatted - making it easy for the developer to find.

Sadly, I often see logging set up so that poorly formatted logs go nowhere, rather than raising alerts until they're fixed. This inevitably leads to both debug logs being lost and critical but slightly misformatted logs being lost.

Your point is particularly valuable when it's time to get the system fixed, because it's easier to say "logging needs to work" than "fix my stupid printf", even though they're roughly equivalent.

Edit: And getting back to the scripting scientist context, scripting scientists still have my formal official permission to just say "just make my print('debug') work".

[–] [email protected] 3 points 11 months ago (1 children)

In a similar vein to making a git friend, buy your sysadmins/ops people a box of doughnuts once in a while. They (generally) all code and will have some knowledge of what you are working on.

[–] MajorHavoc 1 points 11 months ago

That is great advice that has served me well, as well!

[–] [email protected] 15 points 11 months ago (2 children)

As one physicist to another, the most important things in the code are long (descriptive) variable names and comments.

We usually do not do multi-people multi year projects, so all other comments in this page especially the ones coming from programmers are not that relevant. Classes are cool, but they are not needed and often obscure clarity of algorithmic/functional programming.

S. Wolfram (creator of Mathematica) said something along these lines (paraphrasing): if you are writing real code in Mathematica, you are doing something wrong.

[–] [email protected] 4 points 11 months ago* (last edited 11 months ago) (3 children)

We usually do not do multi-people multi year projects

Seriously - why not?

Say you're doing an experiment, wouldn't it be nice if someone else could repeat that experiment? Maybe in 3 years? in 30 years? in 3,000 years time? And maybe they could use your code instead of writing it themselves and possibly getting it wrong?

If something is worth doing, then it is worth doing properly.

Classes are cool, but they are not needed and often obscure clarity

I write code all day professionally. A lot of my code doesn't use classes. I agree they often "obscure clarity".

But sometimes they do the opposite - they make things crystal clear. It's important to know how to use classes and even more important to know when to use them. I guarantee some of the work you do could benefit from a few simple classes. They don't need to be complex - I wrote a class earlier today that is only four lines of code. And yes, a class was appropriate.

[–] [email protected] 2 points 11 months ago

The reason they don't do multi people and multi year coding projects has nothing to do with repeatability of the experiments, most science coding is done through simple-ish code that uses existing libraries, doesn't code them. That code is usually stored in notebooks (jupyter, zeppelin) or simple scripts.

Science code usually falls in the realm of data analysis, and as a data engineer, let me tell you that the analysis part of the job is usually very ad-hoc modifications of the script and live coding through notebooks and such.

The part where whatever conclusion of the research is then transformed into a functioning application, taking care of naming conventions, the architecture of the system where the input data, the transformations, the postprocessing and such is done, is usually done by another team of dedicated data engineers or software developers.

I guess that it would be helpful for the analysis part to have standardized templates for data extraction and such, but usually the tools used in the research portion of the process and the implementation portion are completely different (python with tensorflow vs C++ with openvino or whatever cloud based) so it's not really fair to load the architecture design since the beginning.

[–] UFODivebomb 1 points 11 months ago (1 children)

Great potatoes... This is not very good advice. Ok for prototypes that are intended to be discarded shortly after writing. Nothing more.

[–] [email protected] 2 points 11 months ago (1 children)

Yes, those prototypes are the goal here.

[–] [email protected] 12 points 11 months ago (1 children)

Forget everything you hear about OOP and just view it as a way to improve code readability. Just rewrite something convoluted with a class and you'll see what they're good for. Once you've got over the mental blockade, it'll all make more sense.

[–] [email protected] 5 points 11 months ago (1 children)

To add to this, there are kinda two main use cases for OOP. One is simply organizing your code by having a bunch of operations that could be performed on the same data be expressed as an object with different functions you could apply.

The other use case is when you have two different data types where it makes sense to perform the same operation but with slight differences in behavior.

For example, if you have a “real number” data type and a “complex number” data type, you could write classes for these data types that support basic arithmetic operations defined by a “numeric” superclass, and then write a matrix class that works for either data type automatically.
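Here is a rough Python sketch of that second use case. It is an assumption-laden illustration, not the commenter's code: Python's built-in `float` and `complex` already share the arithmetic interface, so duck typing stands in for the explicit "numeric" superclass, and the `Matrix` class is hypothetical.

```python
from typing import Sequence


class Matrix:
    """Tiny matrix whose entries may be float or complex."""

    def __init__(self, rows: Sequence[Sequence[complex]]) -> None:
        self.rows = [list(row) for row in rows]

    def __matmul__(self, other: "Matrix") -> "Matrix":
        # Relies only on * and + being defined for the entries.
        columns = list(zip(*other.rows))
        return Matrix(
            [
                [sum(a * b for a, b in zip(row, col)) for col in columns]
                for row in self.rows
            ]
        )


real = Matrix([[1.0, 2.0], [3.0, 4.0]])
imag = Matrix([[1j, 0], [0, 1j]])

real_squared = real @ real  # float entries
imag_squared = imag @ imag  # complex entries, same code path
```

The multiplication code never checks which number type it holds. Any entry type that supports `*` and `+` would work.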

[–] [email protected] 2 points 11 months ago (2 children)

One is simply organizing your code by having a bunch of operations that could be performed on the same data be expressed as an object with different functions you could apply.

Not OP, but also interested in wrapping my head around OOP and I still struggle with this in a few different respects. If what I'm writing isn't a full program, but more like a few functions to process data, is there still a use case for writing it in an OOP style? Say I'm doing what you describe, operating on the same data with different functions, if written properly couldn't a program do this even without a class structure to it? 🤔

Perhaps it's inelegant and terrible in the long term, but if it serves a brief purpose, is it more in the case of long term use that it reveals its greater utility?

[–] [email protected] 2 points 11 months ago* (last edited 11 months ago)

I use classes to group data together. E.g.

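import dataclasses

import matplotlib.pyplot as plt
import numpy

@dataclasses.dataclass  # the standard-library module is `dataclasses`
class Measurement:
    temperature: int
    voltage: numpy.ndarray
    current: numpy.ndarray
    another_parameter: bool

    def resistance(self) -> float:
        ...

# parse_measurements() stands for your own loader returning a list of Measurement
measurements = parse_measurements()
# keep only the measurements where another_parameter is set
measurements = [m for m in measurements if m.another_parameter]
plt.plot(
    [m.temperature for m in measurements],
    [m.resistance() for m in measurements],
)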

This is much nicer to handle than three different lists of temperature, voltage and current. And then a fourth list of resistances. And another list for another_parameter. Especially if you have more parameters to each measurement and need to group measurements by these parameters.
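For anyone who wants to run something like this without NumPy or a data file, here is a self-contained sketch. The assumptions are mine: the data are plain lists, the sample is ohmic so the resistance is the least-squares slope of voltage against current, and the example values are invented.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Measurement:
    temperature: float
    voltage: List[float]
    current: List[float]
    another_parameter: bool

    def resistance(self) -> float:
        # Assumption: ohmic sample, so R is the slope of a linear V(I) fit.
        i_mean, v_mean = mean(self.current), mean(self.voltage)
        covariance = sum(
            (i - i_mean) * (v - v_mean)
            for i, v in zip(self.current, self.voltage)
        )
        variance = sum((i - i_mean) ** 2 for i in self.current)
        return covariance / variance


# Invented example data: two sweeps at different temperatures.
measurements = [
    Measurement(280.0, [0.0, 2.0, 4.0], [0.0, 1.0, 2.0], True),
    Measurement(300.0, [0.0, 3.0, 6.0], [0.0, 1.0, 2.0], False),
]
kept = [m for m in measurements if m.another_parameter]
resistances = [m.resistance() for m in kept]
```

Filtering and plotting then work on one list of objects instead of several parallel lists that have to be kept in sync.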

[–] [email protected] 2 points 11 months ago* (last edited 11 months ago) (1 children)

Say I'm doing what you describe, operating on the same data with different functions, if written properly couldn't a program do this even without a class structure to it? 🤔

Yeah, that's kinda where the first object oriented programming came from. In C (which doesn’t have classes) you define a struct (an arrangement of data in memory, kinda like a named tuple in Python), and then you write functions to manipulate those structs.

For example, multiplying two complex vectors might look like:

ComplexVectorMultiply(myVectorA, myVectorB, &myOutputVector, length);

Programmers decided it would be a lot more readable if you could write code that looked like:

myOutputVector = myVectorA.multiply(myVectorB);

Or even just:

myOutputVector = myVectorA * myVectorB;

(This last iteration is an example of “operator overloading”).

So yes, you can work entirely without classes, and that’s kinda how classes work under the hood. Fundamentally object oriented programming is just an organizational tool to help you write more readable and more concise code.
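In Python, the same progression might look like the sketch below. The `Vector` class is hypothetical, and element-wise multiplication is my assumption about what "multiply" means here.

```python
class Vector:
    """Minimal vector that supports element-wise multiplication."""

    def __init__(self, values):
        self.values = list(values)

    def multiply(self, other: "Vector") -> "Vector":
        return Vector(a * b for a, b in zip(self.values, other.values))

    # Operator overloading: `a * b` calls a.__mul__(b).
    def __mul__(self, other: "Vector") -> "Vector":
        return self.multiply(other)

    def __repr__(self) -> str:
        return f"Vector({self.values})"


my_vector_a = Vector([1 + 2j, 3j])
my_vector_b = Vector([2, 1j])

method_result = my_vector_a.multiply(my_vector_b)
my_output_vector = my_vector_a * my_vector_b  # same call, nicer spelling
```

Both spellings run the same function. The operator form is only there for readability.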

[–] [email protected] 1 points 11 months ago

Thanks for elaborating! I'm pretty sure I've written some variations of the first form you mention in my learning projects, or broken them up in some other ways to ease myself into it, which is why I was asking as I did.

[–] [email protected] 9 points 11 months ago* (last edited 11 months ago)

Learn Haskell.

Since it is a research language, it is packed with academically-rigorous implementations of advanced features (currying, lambda expressions, pattern matching, list comprehension, type classes/type polymorphism, monads, laziness, strong typing, algebraic data types, parser combinators that allow you to implement a DSL in 20 lines, making illegal states unrepresentable, etc) that eventually make their way into other languages. It will force you to learn some of the more advanced concepts in programming while also giving you a new perspective that will improve your code in any language you might use.

I was big into embedded C programming years back ... and when I got to the pointers part, I couldn't figure out why I suddenly felt unsatisfied and that I was somehow doing something wrong. That instinct ended up being at least partially correct. I sensed that I was doing something unsafe (which forced me to be very careful around footguns like pointers, dedicating extra mental processes to keep track of those inherently unsafe solutions) and I wished there was some more elegant way around unsafe actions like that (or at least some language provided way of making sure those unintended side effects could be enforced by the compiler, which would prevent these kinds of bugs from getting into my code).

Years later, after not enjoying JS, TS (IMO, a porous condom over the tip of JavaScript), Swift, Python, and others, my journey brought me to FRP which eventually brought me to FP and with it, Haskell, Purescript, Rust, and Nix. I now regularly feel the same satisfaction using those languages that I felt when solving a math problem correctly. Refactoring is a pleasure with strictly typed languages like that because the compiler catches almost everything before it will even let you compile.

[–] [email protected] 9 points 11 months ago

While there are lots of programming courses out there, not many of them will explicitly teach you about good programming principles. Here are a couple things off the top of my head:

High cohesion, low coupling. That is, when you divide up code into functions and classes, try to minimize the number of things going between those functions (if your functions regularly have 6+ arguments, that's a red flag and should be reviewed). And when something needs to be broken up into pieces, try to find the spots where there are minimal points of contact.
Try to divide code between functions and files in a way that doesn't feel too busy. If there are a bunch of related functions that are cluttering up one file, or that are referenced from multiple places, consider making a module for those. If you're not sure what "too busy" means...
Read a style guide. There are lots of things that will help you clean up and organize your code. The guide won't necessarily tell you why to do each thing, but it's a great tool when you don't have another point of reference.

If you have a chance to take a "Software Engineering 101" class, this is where you'd learn most of the basic principles for writing better code.

[–] ericjmorey 9 points 11 months ago (1 children)

Do you want to work as a developer? Or do you want to continue with your research and analysis? If you're only writing code for your own purposes, I don't know why it matters if it's conventional.

[–] [email protected] 5 points 11 months ago

I guess if you are unlikely to go back and change it, or understand how it works, then sure. And yeah that happens.

I write scripts and utilities like that. Modularity is overkill although I do toss in a comment or two to give a hint to future me, just in case.

Although tbf, I took plenty of CS classes and some of the instructors beat best practices into our heads... So writing sloppy, arcane, spaghetti code causes me to flinch...

[–] [email protected] 7 points 11 months ago (3 children)

The thing to think about is reusability. Are you copying and pasting code into multiple places? That's a great candidate to become a class. If you have long lived projects (i.e. something you will use multiple times over a lot of years) maintainability is important. Huge functions and monolithic applications are very hard to maintain over time.

Break your functionality out into small chunks (methods and classes). Keep it simple. It may take a while to get used to this, but your time for adding additional functionality will be greatly improved in the long run.

A lot of great programmers were terrible at one time. Don't let your current lack of knowledge of principles stop you from learning. One of the biggest breakthroughs I had as a programmer is changing how I looked at architecting applications. Following SOLID principles will assist a lot in that. Don't try to understand and use these principles all at once, take your time. Programming isn't what you make your living with, it's a tool to help you be more efficient in your current role.

Realize that becoming a more effective programmer is different for everyone. Like you, I was self taught. I was a systems and network engineer that decided to move into software development. I've since moved into a role in SRE that takes advantage of all the skills I've learned through the years. Like you, a lot of what I write now is about automation and analysis.

[–] [email protected] 5 points 11 months ago

There's a certain amount of fundamentals you need, after that point it's quite easy to hop languages by just looking over the documentation of that language. If you skip those fundamentals, you end up with a bunch of knowledge but don't realize you could do things way more effectively.

My recommendation: check out free resources for beginners and skip the stuff you already know thoroughly, focusing only on the stuff you don't know.

source: I'm self-taught and had to go through this process myself.

[–] [email protected] 5 points 11 months ago

Could be good to try to 'reset' your brain, by learning an entirely new programming language. Ideally, a statically typed, strict language like Rust or Java, or Scala, if you happen to have a use for it in data processing. They'll partially force you to do it the proper way, which can be eye-opening and will translate backwards to Python et al.
Just in general, getting presented the condensate of a different approach to programming, by learning a new language, can teach a lot about programming, even if you're never going back to that language.

For learning more about Git, I can recommend Oh My Git!. It takes a few hours to get through. In my experience, it's really useful to have at least seen all the tools Git provides, because if something goes sideways, you can remedy it with that.

[–] [email protected] 5 points 11 months ago (1 children)

All the other comments are great advice. As an ex chemist who does quite a bit of code I'll add:

Do you want code that works, or code that works?! It's reasonably easy to knock out ugly code that only works once, and that can be just what you need. It takes a little more effort however to make it robust. Think about how it can fail and trap the failures. If you're sharing code with others, this is even more important as people do 'interesting' things.

There's a lot of temporary code that's had a very long life in production, this has technical debt... Is it documented? Is it stable? Is it secure? Ideally it should be

Code examples on the first page of Google tend to work ok, but are not generally secure, e.g. building SQL queries by pasting values into the query string instead of using prepared statements. It doesn't take much extra effort to do it properly and it gives you peace of mind. We create SBOMs for our code now so we can easily check if a component has gained a vulnerability. Doesn't mean our code is good, but it helps. You don't really want to be the person whose code helped let an attacker in.
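A small sketch of the prepared-statement point using the standard-library `sqlite3` module. The table, column names and the hostile input are invented for illustration.

```python
import sqlite3

connection = sqlite3.connect(":memory:")  # throwaway in-memory database
connection.execute("CREATE TABLE samples (name TEXT, mass_g REAL)")
connection.execute("INSERT INTO samples VALUES (?, ?)", ("NaCl", 1.5))

user_input = "NaCl' OR '1'='1"  # hostile input trying to match every row

# Risky: string formatting lets the input rewrite the query.
#   connection.execute(f"SELECT * FROM samples WHERE name = '{user_input}'")

# Safer: the value is passed separately from the SQL text.
rows = connection.execute(
    "SELECT name, mass_g FROM samples WHERE name = ?", (user_input,)
).fetchall()
```

With the placeholder, the hostile string is treated as a literal name that matches nothing, so `rows` comes back empty instead of returning the whole table.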

Any code you write, especially stuff you share will give you a support and maintenance task long term. Pirate for it!

Code sometimes just stops working - at least in my experience. Sacrifice something to the gods and all will be fine.

Finally, you probably know more than you think. You've plenty of experience. Most of the time I can do what I need without e.g. classes, but sometimes I'll intentionally use a technique in a project just to learn it. I can't learn stuff if I don't have a use for it.

I'm still learning, so if I've got any part of the above wrong, please help me out.

[–] ericjmorey 1 points 11 months ago (1 children)

"Pirate for it" was probably the wrong phrase. "Plan for it" was probably what you were thinking when your fingers did something else.

[–] [email protected] 3 points 11 months ago

Thought I did so well on my phone. It kept auto correcting code to coffee. Maybe it was telling me something.

Yes, plan for it!

[–] [email protected] 5 points 11 months ago

If you don't already, use version control (git or otherwise) and try to write useful messages for yourself. 99% of the time, you won't need them, but you'll be thankful that 1% of the time. I've seen database engineers hack something together without version control and, honestly, they'd have looked far more professional if we could see recent changes when something goes wrong. It's also great to be able to revert back to a known good state.

Also, consider writing unit tests to prove your code does what you think it does. This is sometimes more useful for code you'll use over and over, but you might find it helpful in complicated sections where your understanding isn't great. Does the function output what it should or not? Start from some trivial cases and go from there.
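A minimal sketch of what "start from some trivial cases" can look like with the standard-library `unittest` module. The function `celsius_to_kelvin` is a made-up example, not something from the thread.

```python
import unittest


def celsius_to_kelvin(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to kelvin."""
    if celsius < -273.15:
        raise ValueError("temperature below absolute zero")
    return celsius + 273.15


class CelsiusToKelvinTest(unittest.TestCase):
    def test_trivial_case(self):
        self.assertAlmostEqual(celsius_to_kelvin(0.0), 273.15)

    def test_absolute_zero(self):
        self.assertAlmostEqual(celsius_to_kelvin(-273.15), 0.0)

    def test_rejects_impossible_temperature(self):
        with self.assertRaises(ValueError):
            celsius_to_kelvin(-300.0)
```

Saved in a file such as `test_conversion.py` (hypothetical name), this runs with `python -m unittest`.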

Lastly, what's the nature of the code? As a developer, I have to live with my decisions for years (unless I switch jobs.) I need it to be maintainable and reusable. I also need to demonstrate this consideration to colleagues. That makes classes and modules extremely useful. If you're frequently writing throwaway code for one-off analyses, those concepts might not be useful for you at all. I'd then focus more on correctness (tests) and efficiency. You might find your analyses can be performed far quicker if you have good knowledge about data structures and algorithms and apply them well. I've personally reworked code written by coworkers to be 10x more efficient with clever usage of data structures. It might be a better use of your time than learning abstractions we use for large, long-term applications.
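One common instance of the data-structure point, sketched with invented data: checking membership against a list scans the list on every lookup, while a set answers in roughly constant time. The speed-up you actually get depends on your data sizes.

```python
measured_ids = list(range(100_000))       # hypothetical sample IDs
flagged_ids = list(range(0, 100_000, 7))  # hypothetical IDs to exclude

# Slow: `in` on a list scans it element by element for every lookup.
#   clean = [i for i in measured_ids if i not in flagged_ids]

# Faster: build a set once, then each membership test is a hash lookup.
flagged_lookup = set(flagged_ids)
clean = [i for i in measured_ids if i not in flagged_lookup]
```

The result is identical either way. Only the running time changes.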

[–] [email protected] 5 points 11 months ago* (last edited 11 months ago)

Think two things:

optimize the control flow of your code
make it easy to read

You should also be disciplined with these two ideas, your code will look better as you become more experienced, 100% guaranteed.

[–] UFODivebomb 4 points 11 months ago

My advice comes from being a developer, and tech lead, who has brought a lot of code from scientists to production.

The best path for a company is often: do not use the code the scientist wrote and instead have a different team rewrite the system for production. I've seen plenty of projects fail, hard, because some scientist thought their research code is production level. There is a large gap between research code and production. Anybody who claims otherwise is naive.

This is entirely fine! Even better than attempting to build production quality code from the start. Really! Research is solving a decision problem. That answer is important; less so the code.

However, science is science. Being able to reproduce the results the research produced is essential. So there is the standard requirement of documenting the procedure used (which includes the code!) sufficiently to be reproduced. The best part is the reproduction not only confirms the science but produces a production system at the same time! Awws yea. Science!

I've seen several projects fail when scientists attempt to be production developers without proper training and skills. This is bad for the team, product, and company.

(Tho typically those "scientists" fail at building reproducible systems. So are they actually scientists? I've encountered plenty of PhDs in name only.)

So, what are your goals? To build production systems? Then those skills will have to be learned. That likely includes OO. Version control. Structural and behavioral patterns.

Not necessary to learn if that isn't your goal! Just keep in mind that if a resilient production system is the goal, well, research code is like the first pancake in a batch. Verify, taste, but don't serve it to customers.

[–] [email protected] 4 points 11 months ago

Check Udemy for courses and wait for a sale. They normally list for hundreds of dollars but routinely (pretty much monthly) go on sale for about $10 to $15.

[–] Andy 3 points 11 months ago

Two books that may be helpful:

Fluent Python by Luciano Ramalho
Python Distilled by David M. Beazley

I'm more familiar with the former, and think it's very good, but it may not give you the basic introduction to object oriented programming (classes and all that) you're looking for; the latter should.

[–] vahtos 3 points 11 months ago (1 children)

This is only tangentially related to improving your code directly as you have asked. However, in a similar vein as using source control (git), when using Python learn to manage your environments. Venv, poetry, conda/mamba, etc are tools to look into.

I used to work with mostly scientists, and a good number of them knew some Python, but none of them knew how to properly manage their environments and it was a huge problem. They would often come to me and say "I ran this script a week ago and it worked, I tried it today without making any changes and it's throwing this error now that I don't understand." Every time it was because they accidentally changed their dependencies, using their global python install. It also made it a nightmare to try to revive old code for them, since there was almost no way to know what version of various libraries were used.
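One cheap habit that helps with the "what versions did I use?" problem, on top of a per-project environment (for example `python -m venv .venv`), is saving a snapshot of the installed packages next to your results. A minimal sketch using only the standard library; the function name `snapshot_environment` is made up for illustration:

```python
import json
import sys
from importlib import metadata


def snapshot_environment():
    """Return the Python version and installed package versions as a dict."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing metadata
            packages[name] = dist.version
    return {
        "python": sys.version.split()[0],
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    # Print the snapshot as JSON; redirect it to a file kept with your output data.
    print(json.dumps(snapshot_environment(), indent=2))
```

Running `pip freeze > requirements.txt` inside the environment gets you much the same information in a form pip can reinstall from.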

[–] ericjmorey 2 points 11 months ago* (last edited 11 months ago)

This is huge. Unfortunately, as you indicated, there's no standard tool for this and new ones are being added to the mix. Many in the science fields are pushed towards Conda but I'm not sure it's the best option. However, Conda will be infinitely better than not using anything to manage environments and dependencies.

[–] [email protected] 3 points 11 months ago (3 children)

It's always good to learn new stuff but in terms of productivity: Don't attempt to be a programmer. Rather attempt to write better research code (clean up code, revision control, better commenting, maybe testing...)

Rather try to improve cooperation with programmers, if necessary. Close cooperation, asking stupid questions instead of making assumptions etc. makes the process easy for both of you.

Also don't be afraid to consult different programmers since beyond a certain level, experience and expertise in programming is vastly fragmented.

Experienced programmers mostly suck at your field and vice versa, and that's a good thing.


[–] [email protected] 3 points 11 months ago

As one physicist to another, the most important things in code are long (descriptive) variable names and comments.
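To illustrate what I mean, here is a small sketch using the Arrhenius equation. The function and variable names are just ones I picked for the example; the point is that the units and meaning are readable without looking anything up:

```python
import math

GAS_CONSTANT_J_PER_MOL_K = 8.314462618  # molar gas constant R


def arrhenius_rate_constant(pre_exponential_factor,
                            activation_energy_j_per_mol,
                            temperature_k):
    """Rate constant k = A * exp(-Ea / (R * T)); k has the same units as A."""
    thermal_energy_j_per_mol = GAS_CONSTANT_J_PER_MOL_K * temperature_k
    return pre_exponential_factor * math.exp(
        -activation_energy_j_per_mol / thermal_energy_j_per_mol
    )
```

Compare that with `def k(a, e, t): return a * math.exp(-e / (8.314 * t))`, which computes the same thing but tells you nothing six months later.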

S. Wolfram (creator of Mathematica) said something along these lines (paraphrasing): if you are writing real code in Mathematica, you are doing something wrong.

[–] [email protected] 3 points 11 months ago

Learning new programming languages is an awesome way to expand your programming brain. If you want to stay in the same scientific computation niche, you can check out Julia or Mathematica. If you’re just looking to broaden your horizons, the world is your oyster. For me, learning Clojure really cooked my noodle but made me a much better programmer since it taught me functional programming.

Also, just read other people's code! You can learn the conventions that way. Though for you it would be best to find other projects within your niche, because I’m not sure if general web dev code would be super helpful.

There are techniques that are broader than any single language’s conventions, and I think learning those is how you can improve. That’s hard to teach, though, and it comes from experience with a few different languages, in my opinion.

And honestly, I can totally respect the “conventions be damned” attitude, because at the end of the day, you’re trying to make something that works, and if nobody else is reading that code, you’ve made the right trade-off.

[–] [email protected] 3 points 11 months ago

I've got two tips to add to the pile you've already read.

I recommend you read the manuals related to what you are using. Have you read the python manual? And the ones for the libraries you use? If you do you'll definitely find something very useful that you didn't know about.

That, and reread your code. Over and over until it makes total sense, and only run it then. It might seem slow, and it'll require patience at first. Running and testing it will always be slower and is generally only useful when testing out the concept you had in mind. But as long as you're doing your conceptual work right, this shouldn't happen often. And so, most work will be spent trying to track down bugs in the implementation of the concept in the code. Trust me, when you read your code rigorously you'll immediately find issues. In some cases use temporary prints. Oh and avoid the debugger.

[–] [email protected] 3 points 11 months ago

How to Think Like a Computer Scientist may help.

https://www.openbookproject.net/thinkcs/python/english3e/

[–] [email protected] 3 points 11 months ago

You can learn the basics using any beginner's course on udemy.

Then I'd recommend that you build smaller projects to practice. In the future you can change the types of projects you're building to practice even more (REST APIs, website frontend, messaging app, low-level stuff with C/Rust/Go), if you want to reach this level, of course.

[–] [email protected] 2 points 11 months ago

Use an IDE if you aren't already. Jetbrains stuff is great. Having autocomplete is invaluable.

[–] [email protected] 2 points 11 months ago

As the other commenter said, you want to learn about programming principles. Like, low coupling or don't repeat yourself.
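A tiny sketch of "don't repeat yourself", assuming you have the same normalization step copy-pasted for every dataset (the function name and the numbers are made up for illustration):

```python
def normalize_to_max(values):
    """Scale values so the largest one equals 1.0."""
    peak = max(values)
    return [v / peak for v in values]


# One definition, reused for every dataset instead of copy-pasting the loop.
sample_a = normalize_to_max([2.0, 4.0, 8.0])
sample_b = normalize_to_max([1.0, 5.0])
```

If the normalization ever needs to change, there is now exactly one place to change it.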

How long is your longest program? What would you say is a typical length?

You say your code is "bad" -- in what ways? For example:

Readability (e.g. going back to it months later, do you go "oh I remember" or "wtf does this do?!")
Maintainability (go back to update and you have to totally rework a bunch of stuff for a change that seems like it should be simple)
Reliability (mistakes, haphazard "testing", can't trust output)
Maybe something else?

[–] [email protected] 2 points 11 months ago* (last edited 11 months ago)

Computer scientist here. First, let me dare ask the scientists here a question from a friendly fellow: do you have references for your suggestions?

Code Complete 2 is a book on software engineering with plenty of proper references. Software engineering is important because you learn how to work efficiently. I have been involved in plenty of bad science code projects that wasted taxpayers' money because of the naivety of the programmers and team management.

The book explains how and why software construction can become expensive and what to do about it, covering a vast range of topics agreed on by industrial and academic experts.

One caveat, however, is that theories are theories. Even best practices are theories. Often, a young programmer tries to force some practice without checking the reality. You know you can reuse your function to reduce the chance of bugs and save time. But have you tested if that is really the case? Nobody can tell unless you test, or ask your team members if that's a good idea. I've spent a good chunk of time on refactoring that didn't matter. Yet, some mattered.

That importance of reality check is emphasized in the book Software Architecture: The Hard Parts, for example.

Now, classes, or OOP, have been driven by the industry to solve its own problems. Often, as in the case of Java, it was partly a solution for large teams. For them it was important to collaborate while reducing the chance of shooting someone accidentally. So, for a scientific project OOP is sometimes irrelevant, and sometimes relevant. Code size is one factor that determines the effectiveness of OOP, but other factors also exist.

Python uses OOP for providing flexibility (here I actually mean polymorphism to be precise), and sometimes it becomes necessary to use this pattern as some packages rely on it.
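To make the polymorphism point concrete, here is a minimal sketch (the class and function names are hypothetical, chosen for a curve-fitting flavor). Two unrelated classes both provide a `predict` method, so one function can work with either:

```python
class LinearModel:
    def __init__(self, slope, intercept):
        # __init__ runs when the object is created and stores its parameters
        self.slope = slope
        self.intercept = intercept

    def predict(self, x):
        return self.slope * x + self.intercept


class QuadraticModel:
    def __init__(self, curvature):
        self.curvature = curvature

    def predict(self, x):
        return self.curvature * x ** 2


def sum_squared_residuals(model, xs, ys):
    """Works with any object that has a predict(x) method."""
    return sum((model.predict(x) - y) ** 2 for x, y in zip(xs, ys))
```

`sum_squared_residuals` never checks which class it was given; it only relies on `predict` existing. That is roughly the flexibility many Python packages expect from the objects you hand them.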

One problem with Python's OOP is that it inherits implementation. Recent languages seem to avoid this particular type of OOP because its major rival, what is called composition, has proven over time to make a program's behavior easier to predict.
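Nothing stops you from using composition in Python, though. A rough sketch, with made-up class names: the `Spectrum` object holds a smoother as an attribute instead of inheriting from one, so swapping in a different smoother does not touch `Spectrum` at all.

```python
class MovingAverage:
    """Smooths a sequence with a simple trailing moving average."""

    def __init__(self, window):
        self.window = window

    def apply(self, values):
        smoothed = []
        for i in range(len(values)):
            start = max(0, i - self.window + 1)
            chunk = values[start:i + 1]
            smoothed.append(sum(chunk) / len(chunk))
        return smoothed


class Spectrum:
    """Holds intensities and *has* a smoother instead of inheriting from one."""

    def __init__(self, intensities, smoother):
        self.intensities = list(intensities)
        self.smoother = smoother

    def smoothed(self):
        # Delegate the work to whatever smoother object was supplied.
        return self.smoother.apply(self.intensities)
```

Any object with an `apply(values)` method can be passed as `smoother`, which keeps each piece small and easy to reason about on its own.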

To me, writing Python is also often easier with OOP. One popular alternative to OOP is what is called a functional approach, but that is unfortunately not well-supported in Python.

Finally, Automate the Boring Stuff With Python is a great resource on doing routine tasks quickly. Also, pick some Pandas book and get used to its APIs because it improves productivity to a great extent. (I could even cite an article on this! But I don't have the reference at hand.)

Oh, don't forget ChatGPT and Gemini.
