2

Okay, so I am looking for a way to safely run randomly generated binary code. I also need to be able to decompile the code. Any ideas are welcome, in any programming language.

BTW, it must be binary code; bytecode or source code will not work. This is a research project, so I can't go adding variables.

amir
  • 6
    Strange sounding project... why not set up a virtual machine and just run the code in that? If it crashes, you just reset it. – FrustratedWithFormsDesigner Oct 22 '13 at 01:40
  • an emulator will do – ratchet freak Oct 22 '13 at 07:33
  • 4
    This just screams out X/Y problem. You might want a little more background on how you convinced yourself that you need to execute randomly generated binaries. Knowing how you plan to observe them & what results you're looking for is also an important aspect of any possible solution. – Sean McSomething Oct 23 '13 at 23:28

2 Answers

3

From the looks of it, you want to do something similar to what malware researchers do for malware analysis.

As for safely running your random binaries, FrustratedWithFormsDesigner's comment covers it well.

Just run them on a virtual machine to keep any risk from spilling over onto your computer. There are lots of free virtual machines out there, like VirtualBox or some free tier of VMware or Parallels.

But if you want to know more about how to confine or sandbox the binaries, you can also read more here: http://www.porcupine.org/forensics/forensic-discovery/chapter6.html

As for decompiling, the article above also mentions it. You can also look up reverse engineering, which might help.

Though just out of curiosity, do you mind if I ask what the research project is about? Forensics, reverse engineering, or ???

Maru
  • 2
    All man-made programs, once compiled into assembly language, follow a normal distribution curve (bell curve) in the distribution of 1's. The code will not truly be random; it will randomly follow a bell curve. I am going to decompile it and see if I can make sense of it. – amir Oct 22 '13 at 02:09
  • @amir: Do you have any evidence for that claim? It sounds suspect. Are you graphing the total number of 1's, normalized for total program length, or what? – whatsisname Oct 22 '13 at 02:23
  • 1
    It is part of the Central Limit Theorem. It is way more complicated than I put it; what it basically means is that man-made programs are similar to each other (I will have to use a large sample of existing programs to make the curve). [link](http://en.wikipedia.org/wiki/Central_limit_theorem) – amir Oct 22 '13 at 02:36
  • 1
    My take on information theory leads me to think that the distribution should be "fairly" flat. Unless the people at Intel et al got their instruction design wrong. They try to optimize instruction fetch bandwidth etc. – andy256 Oct 22 '13 at 02:52
  • 1
    @andy256 I don't think we are on the same page. This shouldn't have any impact on processing. But after thinking about it, this has a problem either way; I have no way of getting a large number of programs with the same number of bits. – amir Oct 22 '13 at 03:05
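
The measurement debated in these comments could be sketched roughly as below. This is only an illustration: the function name `ones_fraction` is made up here, and dividing by the total bit count is an assumption about what "normalized for total program length" would mean. It is not a method taken from the thread. Normalizing this way at least lets files of different sizes be compared.

```python
import os


def ones_fraction(data: bytes) -> float:
    """Return the fraction of bits set to 1 in data (0.0 for empty input)."""
    if not data:
        return 0.0
    # Count the 1 bits in each byte, then divide by the total number of bits.
    ones = sum(bin(byte).count("1") for byte in data)
    return ones / (8 * len(data))


if __name__ == "__main__":
    # Uniformly random bytes, for comparison against real executables.
    print(f"random bytes: {ones_fraction(os.urandom(1 << 16)):.4f}")
```

To try it on a real program, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes to `ones_fraction`.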
0

So you are going to randomly generate code and run it?

The way I see it, your language choices are:

  1. binary (straight machine instructions)
  2. any higher level language

If (2) is randomly generated, it probably simply won't compile. Assuming your code isn't truly random and you get it to compile, most likely you'll just end up with exceptions because of all the bad data being passed around.

So that leaves you with just (1). You have a much higher probability (still not 100%) of stringing together a bunch of "valid-looking" machine instructions that might run. But then again, 99.99% of your randomly generated instruction sequences won't even get past the first 4 or 5 instructions.

If you want to prove this for yourself, I'd say forget trying to prevent damage. There won't be any. Just compile any C++ program, put a breakpoint in the main function, open the disassembly window, pave over memory with random data, and see how far your program gets.

The only protection you need is already provided for you by any modern OS that supports virtual memory. A normal user process (i.e. one that isn't running in kernel mode) cannot do any damage to another process or the OS itself. The worst it can do is kill itself while executing invalid instructions. I think for your experiment, that should be all the protection you ever need.
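
As a rough sketch of that idea, a parent process can launch the experiment as a separate child process and simply record how it ended. The helper name `run_contained` is hypothetical, and the children here are stand-ins (a Python process that aborts and one that loops forever), not actual random machine code. This only shows that a crash or hang stays inside the child. It does not restrict what system calls or files the child can touch.

```python
import subprocess
import sys


def run_contained(cmd, timeout=5.0):
    """Run cmd as a separate process; return its exit status, or None on timeout."""
    try:
        done = subprocess.run(
            cmd,
            timeout=timeout,  # subprocess.run kills the child if this expires
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return None
    # On POSIX a negative value means the child was killed by that signal.
    return done.returncode


if __name__ == "__main__":
    crasher = [sys.executable, "-c", "import os; os.abort()"]
    looper = [sys.executable, "-c", "while True: pass"]
    print("crasher status:", run_contained(crasher))
    print("looper status:", run_contained(looper, timeout=1.0))
    print("parent still running")
```

The timeout matters because random code is as likely to spin forever as it is to crash.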

DXM
  • It can't do any damage? While the probabilities are low, it just might call the OS function to format your hard drive. Or worse. – Kris Vandermotten Oct 24 '13 at 09:35
  • @KrisVandermotten - true, but that's only 10 times more likely than it using OS functions to connect up to a lottery website and buy you a winning ticket. I don't mean the jackpot, but you know... something much smaller like $10 or $25 – DXM Oct 24 '13 at 14:46