I truly do not understand “AI” (LLMs) as they stand.
I asked Copilot (Microsoft) to suggest a way to log from PowerShell (MS) to Application Insights (MS).
It straight-up made up a PowerShell module and method call. Completely made up, non-existent.
And somehow people are using it for useful decompilation???
edit
I'm not even sure I understand what this repo's point is. It shows various LLMs performing decompilation… but does it show any level of accuracy? I must be missing something.
does this repo show useful real world decompilation or am I missing something
It straight-up made up a PowerShell module and method call. Completely made up, non-existent.
It was just imagining the best way to accomplish the task: instead of complaining, you should have just asked it to give you the source code of that new module.
Your lack of faith in AI is hindering your coding ability.
(do I need to add the /s? no, right?)
Hello Faith Based Code.
We’re going to be going with Faith Based Security next.
With how many faithful AI users there are on Lemmy, you in fact do, lmao.
It straight-up made up a PowerShell module and method call. Completely made up, non-existent.
is it wrong, or is it just ahead of its time?
Maybe it gave you perfectly functioning code that will work flawlessly, on Windows 18.
People that use chatbots usually don’t know how they work.
Machine learning models are "predictive." They train on previous data and then predict stock prices, next week's weather, etc.
LLM chatbots are trained on Google and WhatsApp messages. When you type "hi" they "predict" what a reply on WhatsApp would be.
When you ask how to do something, they predict what the top result on Google would look like. What's correct or not doesn't matter.
Yet people using chatbots assume it googles each of your questions like some kind of scraper and reasons with its "language code."
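To make that "predict the next token" idea concrete, here is a toy sketch in C. The bigram table and tokens are entirely made up for illustration; real chatbots use neural networks trained on enormous corpora, not a lookup table. The point is that the loop only ever asks "what usually comes next?", never "is this true?":

```c
/* Toy sketch of "predict the next token": pick the follower seen most often
 * in (hypothetical) training data, append it, repeat. */
#include <stdio.h>
#include <string.h>

struct bigram { const char *prev, *next; int count; };

/* Made-up counts, standing in for a training corpus. */
static struct bigram table[] = {
    {"hi",     "there",  9}, {"hi",    "hello", 4},
    {"there",  "friend", 7}, {"there", "!",     2},
    {"friend", "<end>",  5},
};

static const char *most_likely_after(const char *prev) {
    const char *best = "<end>";
    int best_count = -1;
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].prev, prev) == 0 && table[i].count > best_count) {
            best = table[i].next;
            best_count = table[i].count;
        }
    return best;
}

int main(void) {
    const char *tok = "hi";
    printf("%s", tok);
    /* Greedy generation: there is no notion of "correct", only "likely". */
    for (;;) {
        tok = most_likely_after(tok);
        if (strcmp(tok, "<end>") == 0)
            break;
        printf(" %s", tok);
    }
    printf("\n");
    return 0;
}
```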
people using chatbots assume it googles each of your questions like some kind of scraper
Most do this in their "thinking" step now, and the results are far better for it, even on current events.
LLMs are good at language-processing tasks. Ask them to write code or solve complex maths and they will make things up. Plus, it takes large amounts of energy to run them, not to mention the data needed to train them.
Code written by them always has security holes. Use them to find facts, correct grammar, or maybe generate a small paragraph or essay. But don't use them to generate code, medical device software, etc.
As a recent example, ChatGPT cannot answer whether there is a seahorse emoji. It will get infinitely stuck trying to be funny while finding an answer, changing its answer mid-token.
ChatGPT cannot answer whether there is a seahorse emoji
Llama says there is, but displays a seashell or a fish, lol. Then a horse, and then it admits there is none.
If I understand the results tables on the repo correctly, their largest model achieves a ~68% re-executability rate on code compiled with the O0 optimization flag. I'm unsure if that just tests whether the decompiled code can be recompiled and executed, or if the programs need to produce the same result on some test cases. If the model is used to refine Ghidra outputs (I'm guessing this is some well-known decompilation framework), it can achieve a ~80% re-executability rate, which is better than Ghidra's baseline of ~34%.
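A check along the lines described above could look something like the sketch below (hypothetical file names, gcc assumed on PATH; not the repo's actual evaluation harness). Step 1 covers the weaker "does it recompile and run" reading, step 2 the stronger "same result on test cases" one:

```c
/* Rough sketch of a "re-executability" check. File names are made up. */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* Step 1: does the decompiled source even compile and run? */
    if (system("gcc -O0 -o decompiled decompiled.c") != 0) {
        puts("FAIL: decompiled code does not recompile");
        return 1;
    }
    if (system("./decompiled < test_input.txt > decompiled_out.txt") != 0) {
        puts("FAIL: recompiled program did not run cleanly");
        return 1;
    }
    /* Step 2: does it behave like the original binary on a test case? */
    if (system("diff -q decompiled_out.txt expected_out.txt") != 0) {
        puts("FAIL: output differs from the original program's output");
        return 1;
    }
    puts("PASS: recompiles and matches on this test case");
    return 0;
}
```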
Why LLM? What was wrong with training a model specifically for decompiling?
LLM is being used in a colloquial way here. It’s just how the algorithm is arranged. Tokenize input, generate output by stacking the most likely subsequent tokens, etc.
It still differentiates them from neural networks and other more basic forms of machine "learning" (god, what an anthropomorphized term from the start…).
They did train a model specifically for decompiling.
Is the decompiled code guaranteed to be equivalent to the compiled code? While this might be cool, it doesn't seem that useful if you can't reason about the correctness of the output. I skimmed the README and didn't manage to figure it out.
I can't speak for this specific approach/system, but no. LLMs never really guarantee anything, and for translation roles like this it's hard to say how much help they provide. The main issue is that you now have to understand what the LLM generated before you can start fixing and/or debugging it.
From my understanding, it tries to tackle the hardest part: getting from assembly back to something human-readable, not necessarily compilable out of the gate.
A large part of the tedious and intensive process of decompilation is just figuring out which chunks of ASM do what and working them out into named functions and variables.
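As a purely illustrative example (not taken from the repo or from Ghidra's actual output), raw decompiler output tends to look like the first function below, and the renaming/refinement work described above turns it into something like the second:

```c
/* Ghidra-style placeholder names: correct logic, zero recovered intent. */
int FUN_00401560(int param_1, int param_2) {
    int local_8 = 0;
    for (int local_c = param_1; local_c < param_2; local_c++)
        local_8 += local_c;
    return local_8;
}

/* The hoped-for refinement: the same logic with meaningful names. */
int sum_range(int start, int end) {
    int total = 0;
    for (int i = start; i < end; i++)
        total += i;
    return total;
}
```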
Now this is a great use of LLMs. Love it. So many old apps and games exist only in compiled form.
If it actually works.
I’d guess training a model on nothing but C and the resulting ASM would be much better.
It doesn’t look like it works very well. If I’m reading their results section correctly, it works less than 20% of the time on real world problems.
lol
I don't get it, how is it better than Ghidra? Or does it try to name functions, vars, and types too, which is hard work?
Or does it try to name functions, vars, and types too,
It tries to do exactly that; it actually uses Ghidra for the initial decompilation.
Mmm, exciting. Will it guess unknown global array variables, where god knows where they start and end? From the git example it seems it just works on specific functions, not globally on the whole code with the global variable space.