Vorpal

joined 1 year ago
[–] Vorpal 3 points 7 months ago

The standard library does have some internal specialisation for certain iterator and collection combinations. Not sure if it will optimise that one specifically, but Vec::into_iter().collect::<Vec<_>>() is optimised (it may look silly, but it comes up with functions returning impl Iterator).
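A minimal sketch of the pattern in question. The names `evens` and `collect_evens` are made up for illustration, and whether the allocation is actually reused is an implementation detail of the standard library, not a documented guarantee:

```rust
// Hypothetical helper: it takes a Vec but only promises "some iterator" back.
fn evens(input: Vec<u32>) -> impl Iterator<Item = u32> {
    input.into_iter().filter(|n| n % 2 == 0)
}

// A caller that needs a Vec again simply collects. This is the
// Vec -> iterator -> Vec round trip that the standard library may
// specialise internally (for example by reusing the original buffer).
fn collect_evens(input: Vec<u32>) -> Vec<u32> {
    evens(input).collect()
}
```

The code is correct whether or not the specialisation kicks in; only the number of allocations can differ.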

[–] Vorpal 2 points 7 months ago* (last edited 7 months ago)

Hm, that is a fair point. Perhaps it would make sense to produce a table of checks: indicate which checks each dependency fails/passes, and then colour code them with severity.

Some experimentation on real world code is probably needed. I plan to try this tool on my own projects soon (after I have manually verified that your crate matches your git code (hah! Bootstrap problem); I already reviewed your code on github and it seemed to do what it claims).

[–] Vorpal 3 points 7 months ago (2 children)

Yes, obviously there are more ways to hide malicious code.

As for the git commit ID, I didn't see you using it even when it was available though? But perhaps that could be a weakness, if the commit ID used does not match the tag in the repo, that would be a red flag too. That could be worth checking.

[–] Vorpal 8 points 7 months ago (4 children)

Due to the recent xz trouble I presume? Good idea, I was thinking about this on an ecosystem-wide scale (e.g. all of crates.io or all of a Linux distro) which is a much harder problem to solve.

Not sure if the tag logic is needed though. I thought cargo embedded the commit ID in the published package?

Also I'm amazed that the name cargo-goggles was available.

[–] Vorpal 0 points 8 months ago (1 children)

Please, send an email to [email protected] to report this issue to them, they usually fix things quickly.

[–] Vorpal 2 points 8 months ago

Sure, but my point was that such a C ABI is a pain. There are some crates that help:

  • Rust-C++: cxx and autocxx
  • Rust-Rust: stabby or abi_stable

But without those and just plain bindgen it is a pain to transfer any types that can't easily just be repr(C), and there are quite a few such types. Enums with data for example. Or anything using the built in collections (HashMap, etc) or any other complex type you don't have direct control over yourself.

So my point still stands. FFI with just bindgen/cbindgen is a pain, and lack of stable ABI means you need to use FFI between rust and rust (when loading dynamically).
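To make the "enums with data" pain concrete, here is a rough sketch of one way to hand-write a C compatible stand-in for a small Rust enum. All the names (`Shape`, `CShape`, `shape_area` and so on) are invented for this example and do not come from any of the crates mentioned above:

```rust
use std::f64::consts::PI;

/// The idiomatic Rust type you would rather pass around.
pub enum Shape {
    Circle { radius: f64 },
    Rect { w: f64, h: f64 },
}

/// Hand-written tag for the C compatible version.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ShapeTag {
    Circle = 0,
    Rect = 1,
}

#[repr(C)]
#[derive(Clone, Copy)]
pub struct RectData {
    pub w: f64,
    pub h: f64,
}

/// Payload: which field is valid depends on the tag.
#[repr(C)]
#[derive(Clone, Copy)]
pub union ShapeData {
    pub radius: f64,
    pub rect: RectData,
}

#[repr(C)]
#[derive(Clone, Copy)]
pub struct CShape {
    pub tag: ShapeTag,
    pub data: ShapeData,
}

impl From<&Shape> for CShape {
    fn from(shape: &Shape) -> Self {
        match shape {
            Shape::Circle { radius } => CShape {
                tag: ShapeTag::Circle,
                data: ShapeData { radius: *radius },
            },
            Shape::Rect { w, h } => CShape {
                tag: ShapeTag::Rect,
                data: ShapeData { rect: RectData { w: *w, h: *h } },
            },
        }
    }
}

/// C ABI function. A real exported symbol would also need a no_mangle
/// attribute, and a caller passing an invalid tag would be undefined behaviour.
pub extern "C" fn shape_area(shape: CShape) -> f64 {
    match shape.tag {
        // SAFETY: the tag says which union field was written.
        ShapeTag::Circle => {
            let r = unsafe { shape.data.radius };
            PI * r * r
        }
        ShapeTag::Rect => {
            let r = unsafe { shape.data.rect };
            r.w * r.h
        }
    }
}
```

That is a lot of boilerplate for a two-variant enum, and something like a HashMap has no such simple mirror type at all.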

In fact FFI is a pain in most languages (apart from C itself where it is business as usual... oh wait that is the same as pain, never mind) since you are limited to the lowest common denominator for types except in a few specific cases.

[–] Vorpal 1 points 8 months ago (2 children)

Yes, rust is that much of a pain in this case, since you can only safely pass plain C compatible types across the plugin boundary.

One reason is that rust doesn't have stable layouts of structs and enums: the compiler is free to optimise them to avoid padding by reordering fields, decide which parts to use as niches for Options, etc. And yes, that changes every now and then as the devs come up with new optimisations. I think it changed most recently last summer.
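A small sketch of what that means in practice. The struct names are made up, and the size of the default-layout struct is deliberately not hard-coded since it is unspecified:

```rust
use std::mem::size_of;

// Default (Rust) representation: the compiler may reorder the fields,
// so the size and field offsets are not something you can rely on.
#[allow(dead_code)]
struct Reorderable {
    a: u8,
    b: u32,
    c: u8,
}

// C representation: fields stay in declared order with C padding rules.
#[allow(dead_code)]
#[repr(C)]
struct Fixed {
    a: u8,
    b: u32,
    c: u8,
}

/// Returns (size with default layout, size with repr(C) layout).
fn layout_report() -> (usize, usize) {
    (size_of::<Reorderable>(), size_of::<Fixed>())
}

/// Niche example: None is stored in the otherwise invalid null value,
/// so the Option costs no extra space for references.
fn option_ref_is_free() -> bool {
    size_of::<Option<&u8>>() == size_of::<&u8>()
}
```

On common targets where u32 is 4-byte aligned the repr(C) struct is 12 bytes, while current compilers typically pack the default-layout one tighter. Nothing promises that the two sides of a plugin boundary, built with different compiler versions, agree on the default layout.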

[–] Vorpal 1 points 8 months ago* (last edited 8 months ago)

So there are a couple of options for plugins in Rust (and I haven't tried any of them, yet):

  • Wasm, supposedly https://extism.org/ makes this less painful.
  • libloading + C ABI
  • One of the two stable ABI crates (stabby or abi_stable) + libloading
  • If you want to build them into your code base but not have to update a central list there is linkme and inventory.
  • An embedded scripting language might also be a (very different) option. Something like mlua, rhai or rune.

I don't know if any of these suit your needs, but at least you now have some things to investigate further.
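For the C ABI route, the interface between host and plugin often ends up looking something like the sketch below. It is std-only: the actual dynamic loading (which is what libloading is for) is left out, and the "plugin" lives in the same file so the example is self-contained. `PluginVTable`, `plugin_entry`, `call_plugin` and the version constant are all hypothetical names:

```rust
use std::ffi::{c_char, CStr};

/// Version the host was built against (hypothetical scheme).
pub const HOST_ABI_VERSION: u32 = 1;

/// Table of C ABI function pointers that a plugin hands to the host.
#[repr(C)]
pub struct PluginVTable {
    pub abi_version: u32,
    pub name: extern "C" fn() -> *const c_char,
    pub run: extern "C" fn(input: i32) -> i32,
}

// --- "Plugin" side -------------------------------------------------------

extern "C" fn demo_name() -> *const c_char {
    // NUL-terminated static string, valid for the whole program.
    b"demo\0".as_ptr().cast()
}

extern "C" fn demo_run(input: i32) -> i32 {
    input * 2
}

/// Entry point a host would look up by symbol name in a real shared
/// library (that would additionally need a no_mangle attribute).
pub extern "C" fn plugin_entry() -> PluginVTable {
    PluginVTable {
        abi_version: HOST_ABI_VERSION,
        name: demo_name,
        run: demo_run,
    }
}

// --- Host side -----------------------------------------------------------

/// Calls the plugin through the table, refusing mismatched versions.
pub fn call_plugin(vt: &PluginVTable, input: i32) -> Option<(String, i32)> {
    if vt.abi_version != HOST_ABI_VERSION {
        return None;
    }
    // SAFETY: the plugin contract (assumed here) is that `name` returns a
    // valid, NUL-terminated string that outlives the call.
    let name = unsafe { CStr::from_ptr((vt.name)()) }
        .to_string_lossy()
        .into_owned();
    Some((name, (vt.run)(input)))
}
```

Only plain C compatible types cross the boundary here, which is exactly the limitation the stable ABI crates try to lift.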

[–] Vorpal 6 points 8 months ago* (last edited 8 months ago) (1 children)

Sounds interesting! As I don't know restic that this is apparently based on, what are the differentiating factors between them? While I'm always on board for a rewrite in Rust in general, I'm curious as to if there is anything more to it than that.

EDIT: seems this is already answered in the FAQ, my bad.

[–] Vorpal 3 points 8 months ago (1 children)

With native code I mean machine code. That is indeed usually produced by C or C++, though there are some other options too, notably Rust and Go both also compile to native machine code rather than some sort of byte code. In contrast Java, C# and Python all compile to various byte code representations (that are usually much higher level and thus easier to figure out).

You could of course also have hand written assembly code, but that is rare these days outside a few specific critical functions like memcpy or media encoders/decoders.

I basically learnt as I went, googling things I needed to figure out. I was goal oriented in this case: I wanted to figure out how some particular drivers worked on a particular laptop so I could implement the same thing on Linux. I had heard of and used ghidra briefly before (during a capture the flag security competition at university). I didn't really want to use it here though, to ensure I could be fully in the clear legally. So I focused on tracing instead.

I did in fact write up what I found out. Be warned it is a bit on the vague side and mostly focuses on the results I found. I did plan a followup blog post with more details on the process as well as more things I figured out about the laptop, but never got around to it. In particular I did eventually figure out power monitoring and how to read the fan speed. Here is a link to what I did write, if you are interested: https://vorpal.se/posts/2022/aug/21/reverse-engineering-acpi-functionality-on-a-toshiba-z830-ultrabook/

[–] Vorpal 7 points 8 months ago* (last edited 8 months ago) (3 children)

The term you are looking for in general is "reverse engineering". For software in particular you are looking at disassembly, decompilation and various forms of tracing and debugging.

As for particular software: For .NET there is ILSpy that can help you look into how things work. For native code I have used Ghidra in the past.

Native code is a lot more effort to understand. In both cases things like variable names will be gone. Most function names will be missing (even more so for native code). Type names too. For native code the types themselves will be gone, so you will have to look at what is going on and guess if something is a struct or an array. How big is the struct and what are the fields?

Left over debug or logging lines are very valuable in figuring out what something is. Often times you have to go over a piece of disassembly or decompiled code several times as your understanding of it gradually builds.

C++ code with lots of object orientation tends to be easier to figure out the big picture of than C code, as the classes and inheritance provides a more obvious pattern.

Then there is dynamic tracing (running under some sort of debugger or call tracer to see what the software does). I have not had as much success with this.

Note that I'm absolutely an amateur at reverse engineering. I thought it was interesting enough that I wanted to learn it (and I had a small project where it was useful). But I'm mostly a programmer.

I have done a lot of low level programming (C, C++, even a small amount of assembly, in recent times a lot of Rust), and this knowledge helps when reverse engineering. You need to understand how compilers and linkers lower code to machine code in order to have a fighting chance at reversing that.

Also note that there may be legal complications when doing reverse engineering, especially with regards to how you make use of the things you learned. I'm not a lawyer, this is not legal advice, etc. But check out the legal guidelines of Asahi Linux (who are working on reverse engineering M1 macs to run Linux on them): https://asahilinux.org/copyright/ (scroll down to "reverse engineering policy").

Now this covers (at a high level) how to figure things out. How you then patch closed source software I have no idea. Haven't looked into that, as my interest was in figuring out how hardware and drivers worked to make open source software talk to said hardware.

[–] Vorpal 14 points 8 months ago

I have read it, it is a very good book, and the memory ordering and atomics sections are also applicable to C and C++ since all of these languages use the same memory ordering model.
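As a taste of what carries over between the languages, here is the classic release/acquire "publish" pattern, written in Rust. This is my own minimal sketch rather than an example taken from the book, and `publish_and_read` is a made-up name. The same orderings exist as memory_order_release and memory_order_acquire in C and C++:

```rust
use std::hint::spin_loop;
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::thread;

/// One thread writes a value and then sets a flag with Release; the other
/// spins on the flag with Acquire and then reads the value.
fn publish_and_read() -> u32 {
    let data = AtomicU32::new(0);
    let ready = AtomicBool::new(false);

    thread::scope(|s| {
        s.spawn(|| {
            data.store(42, Ordering::Relaxed);
            // Release: everything written before this store becomes visible
            // to a thread that observes `true` with an Acquire load.
            ready.store(true, Ordering::Release);
        });

        let reader = s.spawn(|| {
            while !ready.load(Ordering::Acquire) {
                spin_loop();
            }
            // Guaranteed to see 42 thanks to the release/acquire pairing.
            data.load(Ordering::Relaxed)
        });

        reader.join().unwrap()
    })
}
```

With both flag operations downgraded to Relaxed, the reader would no longer be guaranteed to see the 42.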

Can strongly recommend it if you want to do any low level concurrency (which I do in my C++ day job). I recommended it to my colleagues too whenever they had occasion to look at such code.

I do wish there was a bit more on more obscure and advanced patterns though. Things like RCU, seqlocks etc basically get an honorable mention in chapter 10.

 

cross-posted from: https://programming.dev/post/10657765

I made a replacement for the venerable paccheck. It checks if files managed by the package manager have changed and if so reports that back to the user. Unlike paccheck it is cross distro (supports Debian too and could be further extended), and it uses all your CPU cores to be as fast as possible.

Oh and it is written in Rust (that may be a plus or minus depending on your opinion, but it wouldn't have happened at all in any language except Rust, and Rust makes it very easy to add this sort of parallelism).
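To give a feel for how little ceremony that kind of parallelism needs, here is a toy sketch using only the standard library. It is not how the actual tool is implemented: the `Entry` type, the in-memory "files" and the FNV-1a checksum are all stand-ins chosen to keep the example self-contained.

```rust
use std::thread;

/// Toy checksum (64-bit FNV-1a); a real tool would use the hashes
/// recorded by the package manager.
fn checksum(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for byte in data {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Hypothetical record: a path, its current contents, and the checksum
/// the package database expects.
struct Entry<'a> {
    path: &'a str,
    contents: &'a [u8],
    expected: u64,
}

/// Splits the entries across `workers` scoped threads and returns the
/// paths whose contents no longer match, in input order.
fn changed_paths<'a>(entries: &[Entry<'a>], workers: usize) -> Vec<&'a str> {
    let workers = workers.max(1);
    // Ceiling division, and at least 1 because chunks(0) would panic.
    let chunk_size = ((entries.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        let handles: Vec<_> = entries
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        .filter(|e| checksum(e.contents) != e.expected)
                        .map(|e| e.path)
                        .collect::<Vec<_>>()
                })
            })
            .collect();

        // Joining in spawn order keeps the results in input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    })
}
```

The borrow checker is what makes this comfortable: the threads borrow the entries directly, and the compiler verifies that nothing outlives the scope.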

There are more details (including benchmarks) in the readme on github. Maybe it is useful to some of you.

(The main goal of this project is not actually the program produced so far, but to continue building this into a library. I have a larger project in the planning phase that needs this (in library form) as part of it.)

 

This is a Rust replacement for debsums (on Debian/Ubuntu/...) and paccheck (on Arch Linux and derivatives). It is much faster than those thanks to using all your CPU cores in parallel. What it does is check files installed by your package manager for changes and reports those on stdout.

This is a project I have been working on over the past few weeks. There are more details (including benchmarks) in the readme.

I normally don't advertise my open source projects (having users other than yourself is both a blessing and a curse), but since there was recent discussion on how to grow this lemmy group I thought I'd post it. Maybe it is useful to some of you.

I also spent quite some time on optimising this (including a lot of benchmarking, profiling and trying alternative solutions). In the end I'm happy with the performance, though I am considering io-uring for disk IO.

The main goal of this project is not actually the program produced so far, but to continue building this into a library (currently very little is exposed as pub, because the API will change). I have a larger project in the planning phase that needs this (in library form) as part of it.

 

I'm not affiliated with the site, but I found this interesting. Especially the quite nuanced discussion about if rust is hard or not.

With my background in systems level safety critical hard realtime C++ (plus a bunch of functional programming as a hobby) I feel that the answer was no (for me personally). Basically learn about lifetimes and borrowing and then learn the syntax, done. (Async and unsafe are arguably harder, but again I had the requisite background for it to be mostly familiar, though haven't needed to write much unsafe yet.)

But it was very interesting hearing the other perspective as well! Why rust might feel hard if you have a background in JS/Python/Go etc.

And it was awesome to hear such a nuanced discussion on the Internet, that is truly a rare thing these days.

17
New features on lib.rs (users.rust-lang.org)
 

cross-posted from: https://programming.dev/post/1825728

Lots of new features!

Thought I should share this with those who don't use users.rust-lang.org. Note: I'm not affiliated with lib.rs, I'm only reposting to lemmy.

15
New features on lib.rs (users.rust-lang.org)
submitted 1 year ago by Vorpal to c/rust
 

Lots of new features!

Thought I should share this with those who don't use users.rust-lang.org. Note: I'm not affiliated with lib.rs, I'm only reposting to lemmy.
