this post was submitted on 04 Aug 2024
206 points (96.4% liked)


To accelerate the transition to memory safe programming languages, the US Defense Advanced Research Projects Agency (DARPA) is driving the development of TRACTOR, a programmatic code conversion vehicle.

The term stands for TRanslating All C TO Rust. It's a DARPA project that aims to develop machine-learning tools that can automate the conversion of legacy C code into Rust.

The reason to do so is memory safety. Memory safety bugs, such as buffer overflows, account for the majority of major vulnerabilities in large codebases. DARPA's hope is that AI models can help with the programming language translation and so make software more secure.

"You can go to any of the LLM websites, start chatting with one of the AI chatbots, and all you need to say is 'here's some C code, please translate it to safe idiomatic Rust code,' cut, paste, and something comes out, and it's often very good, but not always," said Dan Wallach, DARPA program manager for TRACTOR, in a statement.

[–] [email protected] 9 points 3 months ago* (last edited 3 months ago) (4 children)

Can someone explain more of the difference between C and Rust to a non programmer?

[–] [email protected] 20 points 3 months ago (2 children)

There is a ton of literature out there, but in a few words:

Rust is built from the ground up with the intention of being safe, and fast. There are a bunch of things you can do when programming that are technically fine but often cause errors. Rust builds on decades of understanding of best practices and forces the developer to follow them. It can be frustrating at first but being forced to use best practices is actually a huge boon to the whole community.

C is a language that lets the developer do whatever the heck they want as long as it's technically possible. "Dereferencing pointer 0?" No problem, boss. C is fast, but there are many, many pitfalls, and mildly incorrect code can cause significant problems. Buffer overflows, for example, can let a bad actor send crafted packets to the program and make your computer do whatever the bad actor wants. You can technically write code with that problem in both C and Rust, but Rust has guardrails that keep you out of trouble.
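To make the "pointer 0" point a bit more concrete, here is a minimal sketch (my own illustration, not from the article; `first_byte`, `find_user` and `greeting` are made-up names). Safe Rust has no null pointer to dereference. A value that might be missing is an `Option`, and the compiler insists you handle the empty case before you touch the value.

```rust
/// Returns the first byte of `data`, or `None` if `data` is empty.
/// In C, reading data[0] from an empty buffer is undefined behavior.
fn first_byte(data: &[u8]) -> Option<u8> {
    data.first().copied()
}

/// Hypothetical lookup: there may or may not be a name for this id.
fn find_user(id: u32) -> Option<&'static str> {
    match id {
        1 => Some("alice"),
        2 => Some("bob"),
        _ => None,
    }
}

/// The caller has to say what happens when nothing was found;
/// leaving out the `None` arm is a compile error.
fn greeting(id: u32) -> String {
    match find_user(id) {
        Some(name) => format!("hello, {}", name),
        None => String::from("no such user"),
    }
}
```

Calling `greeting(99)` takes the `None` arm and returns a normal string, where the equivalent C might have followed a null pointer.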

[–] [email protected] 16 points 3 months ago (2 children)

But if they have fully tested and safe C, and they're converting it to Rust using AI, that seems more dangerous, not less.

[–] [email protected] 4 points 3 months ago (1 children)

Just recently a bug was found in OpenSSH that would let you log in as root on any machine. With extreme skill and luck, of course, but it was possible.

OpenSSH is probably one of the safest C programs out there, with the most eyes on it, since it's the industry standard for remotely logging in to any machine.

There is no such thing as fully tested and safe C. You can only hope that you find the bug before the attacker does, which requires constant maintenance.

The thing about Rust is that the code can sit there unchanged and "rust". It's not hard to write a program in 2019 that hasn't needed any maintenance since then and is free of memory bugs.

[–] [email protected] 1 points 3 months ago (1 children)

Just so you know, that bug was a months-long hack, probably by a state actor, not just something they didn't spot before.

[–] [email protected] 1 points 3 months ago

It still goes to show that there's no fully tested C code. I'm sure OpenSSH has had the eyes of thousands of security researchers on it. Yet it still has memory-related bugs.

[–] onlinepersona 0 points 3 months ago (2 children)

There is no fully tested and safe C. There's only C that hasn't had a buffer overflow, free after use, ... yet.

It's hyperbole, but actually tested C without bugs is few and far between. Most C/C++ code has neither unit nor integration tests, and I have barely seen fuzzing (which seems to be the most prominent kind of testing out there).

Anti Commercial-AI license

[–] [email protected] 2 points 3 months ago (1 children)

free after use

That would be perfectly safe in any language.

[–] onlinepersona 0 points 3 months ago
[–] [email protected] 1 points 3 months ago

Safest C is a Hello World program.

[–] [email protected] 5 points 3 months ago (1 children)

That's a pretty good explanation. So along the same level of explanation, what are these memory problems they are talking about?

[–] [email protected] 13 points 3 months ago (1 children)

I explained a little about buffer overflows, but in essence programming is the act of making a fancy list of commands for your computer to run one after the other.

One concept in programming is an "array", or list of things. In languages like C, the developer is sometimes responsible for keeping track of how many items are in a list. When that program accepts info from other programs (like a chat message, video call, website to render, etc.) in the form of an array, the sender can sometimes send more info than the developer expected to receive.

When that extra info is received it can actually modify the fancy list of commands in such a way that the data itself is run directly on the computer instead of what the developer originally intended.

Bad guy sends too much data, and at the end of the data are secret instructions to install a new program that watches every key you type on your keyboard and sends that info to the bad guy.
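For the curious, this is roughly what the "too much data" case looks like when it is handled. It's only a sketch under my own assumptions: `BUF_LEN`, `receive` and `read_slot` are invented names, and the 8-byte buffer stands in for something like `char buf[8]` in C.

```rust
/// Fixed-size receive buffer, like `char buf[8]` in C.
const BUF_LEN: usize = 8;

/// Copies `incoming` into a fixed buffer, rejecting anything too long.
/// C's memcpy would happily write past the end if the length was wrong.
fn receive(incoming: &[u8]) -> Result<[u8; BUF_LEN], String> {
    if incoming.len() > BUF_LEN {
        return Err(format!(
            "message is {} bytes, buffer holds {}",
            incoming.len(),
            BUF_LEN
        ));
    }
    let mut buf = [0u8; BUF_LEN];
    buf[..incoming.len()].copy_from_slice(incoming);
    Ok(buf)
}

/// Reads one slot; an out-of-range index gives `None` instead of
/// whatever happens to sit in memory after the buffer.
fn read_slot(buf: &[u8; BUF_LEN], index: usize) -> Option<u8> {
    buf.get(index).copied()
}
```

Even if the developer forgot the length check in `receive`, the slice expression `buf[..incoming.len()]` would stop the program with a panic rather than quietly overwrite neighboring memory. That is the guardrail: the mistake becomes a crash, not a way in.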

[–] [email protected] 6 points 3 months ago
[–] [email protected] 7 points 3 months ago

In C you can do almost anything, including things that will fry the system. In Rust, it's a lot harder to do that. (This makes sense if you consider when the languages were made and what they were made for. It's not an attack on or praise for either language.)

[–] [email protected] 6 points 3 months ago (1 children)

C: Older systems development language, pretty much the industry standard, to the point that C-style syntax is often a feature of other languages. Its biggest issues include a massive lack of syntax sugar, such as having to write structTypeFunction(structInstance) rather than structInstance.function() as is standard in more modern languages; the use of header files and a preprocessor (originally invented to get around memory limitations and still liked by hard-core C fans, but disliked by everyone else); and the lack of built-in memory safety features. The last one is especially infamous with its null-terminated strings, which are part of many attack vectors and bugs.
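A small sketch of the two syntax points above, for comparison with the Rust side (the `Player` type and its functions are hypothetical, purely for illustration): methods hang off the value, and strings carry their own length instead of relying on a terminating zero byte.

```rust
/// A small struct with methods attached to it.
struct Player {
    name: String,
    score: u32,
}

impl Player {
    /// Called as `player.add_points(5)` rather than the C-style
    /// `player_add_points(&player, 5)`.
    fn add_points(&mut self, points: u32) {
        self.score += points;
    }

    /// Strings store their length, so there is no terminating zero
    /// byte to forget or overwrite.
    fn name_len(&self) -> usize {
        self.name.len()
    }
}

/// Builds a player with a zero score.
fn make_player(name: &str) -> Player {
    Player { name: name.to_string(), score: 0 }
}
```

Because the length is stored alongside the bytes, a zero byte inside a name is just ordinary data: `make_player("a\0b").name_len()` is 3, where C's `strlen` would stop at 1.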

Rust: Newer memory-safe language with functional programming features, most notably const by default. While it does use curly braces for scopes (code blocks), the general syntax is a bit alien to the C-style family of languages. Due to its heavy memory safety features, which include a borrow checker, not to mention the functional programming aspects, it's not a drop-in replacement for C, to the point that you pretty much have to reimplement the algorithms in a functional style.
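Here is a minimal sketch of what "const by default" and the borrow rules look like in practice (function names are my own, not from any real codebase). A variable can't be changed unless it is declared `mut`, and a function can only modify its argument if the signature says `&mut`.

```rust
/// Takes a shared (read-only) borrow: it cannot modify `values`.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Takes a mutable borrow: the `&mut` in the signature announces
/// the side effect to every caller.
fn double_all(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

/// Returns the totals of an untouched list and a doubled list.
fn demo() -> (i32, i32) {
    let fixed = vec![1, 2, 3]; // immutable by default
    // fixed.push(4);          // would not compile: `fixed` is not `mut`

    let mut changing = vec![1, 2, 3]; // opt in to mutation with `mut`
    double_all(&mut changing);

    (total(&fixed), total(&changing))
}
```

The call site has to write `&mut changing` as well, so a function that changes your data can't do it behind your back.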

[–] 0x0 1 points 3 months ago (1 children)

I can somewhat see the issue with memory safety, but the other issues are fine by me.

[–] [email protected] 2 points 3 months ago

Even then, D would be a better drop-in replacement, especially in BetterC mode, since it has a currently optional memory safety feature, which is planned to be less optional in a possible Version 3. Personally, I have only run into one issue that would have been solved by a "const by default" approach (a function had an unintended side effect, and the functional approach is to disallow side effects as much as possible), but that approach would be extra annoying for my own purpose (game development).

The biggest fixer of "unintended side effects" is memory safety, since you won't have memory overwrites.