The first thing to understand is that P and NP classify languages, not problems. To understand what this means, we need some other definitions first.
An alphabet is a non-empty finite set of symbols.
{0, 1} is an alphabet, as is the ASCII character set. {} is not an alphabet because it is empty. N (the natural numbers) is not an alphabet because it is not finite.
Let Σ be an alphabet. An ordered concatenation of a finite number of symbols from Σ is called a word over Σ.
The string 101 is a word over the alphabet {0, 1}. The empty word (often written as ε) is a word over any alphabet. The string penguin is a word over the alphabet containing the ASCII characters. The decimal notation of the number π is not a word over the alphabet {., 0, 1, 2, 3, 4, 5, 6, 7, 8, 9} because it is not finite.
The length of a word w, written as |w|, is the number of symbols in it.
For example, |hello| = 5 and |ε| = 0. For any word w, |w| ∈ N and is therefore finite.
Let Σ be an alphabet. The set Σ* contains all words over Σ, including ε. The set Σ+ contains all words over Σ, excluding ε. For n ∈ N, Σ^n is the set of words of length n.
For every alphabet Σ, Σ* and Σ+ are countably infinite sets. For the ASCII character set Σ_ASCII, the regular expressions .* and .+ denote Σ_ASCII* and Σ_ASCII+ respectively.
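To see the countability concretely, one can enumerate Σ* by increasing word length. Here is a small Python sketch (the generator name `words` is mine):

```python
from itertools import count, product

def words(alphabet):
    # Enumerate Sigma* in order of increasing length: first the empty
    # word, then all words of length 1, then length 2, and so on.
    # This enumeration is exactly why Sigma* is countable.
    for n in count(0):
        for symbols in product(alphabet, repeat=n):
            yield "".join(symbols)

gen = words("01")
print([next(gen) for _ in range(7)])  # ['', '0', '1', '00', '01', '10', '11']
```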
{0, 1}^7 is the set of 7-bit ASCII codes {0000000, 0000001, …, 1111111}. {0, 1}^32 is the set of 32-bit integer values.
Let Σ be an alphabet and L ⊆ Σ*. L is called a language over Σ.
For an alphabet Σ, the empty set and Σ* are trivial languages over Σ. The former is often referred to as the empty language. The empty language {} and the language containing only the empty word {ε} are different.
The subset of {0, 1}^32 that corresponds to non-NaN IEEE 754 floating-point values is a finite language.
Languages can have an infinite number of words, but every language is countable. The set of strings {1, 2, …} denoting the natural numbers in decimal notation is an infinite language over the alphabet {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. The infinite set of strings {2, 3, 5, 7, 11, 13, …} denoting the prime numbers in decimal notation is a proper subset thereof. The language containing all words matching the regular expression [+-]?\d+\.\d*([eE][+-]?\d+)? is a language over the ASCII character set (denoting a subset of the valid floating-point expressions as defined by the C programming language).
There is no language containing all real numbers (in any notation) because the set of real numbers is not countable.
Let Σ be an alphabet and L ⊆ Σ*. A machine D decides L if for every input w ∈ Σ* it computes the characteristic function χ_L(w) in finite time. The characteristic function is defined as

χ_L: Σ* → {0, 1}
χ_L(w) = 1 if w ∈ L, and 0 otherwise.
Such a machine is called a decider for L. We write “D(w) = x” for “given w, D outputs x”.
There are many machine models. The most general one that is in practical use today is the model of the Turing machine. A Turing machine has unlimited linear storage clustered into cells. Each cell can hold exactly one symbol of an alphabet at any point in time. The Turing machine performs its computation as a sequence of computation steps. In each step, it can read one cell, possibly overwrite its value, and move the read/write head by one cell to the left or right. What action the machine performs is controlled by a finite state automaton.
A random access machine with a finite set of instructions and unlimited storage is another machine model that is as powerful as the Turing machine model.
For the sake of this discussion, we shall not concern ourselves with the precise machine model but merely say that the machine has a finite deterministic control unit, unlimited storage, and performs a computation as a sequence of steps that can be counted.
Since you've used it in your question, I assume that you're already familiar with “big-O” notation so here is only a quick refresher.
Let f: N → N be a function. The set O(f) contains all functions g: N → N for which there exist constants n₀ ∈ N and c ∈ N such that for every n ∈ N with n > n₀ it holds that g(n) ≤ c f(n).
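For example, the function n ↦ 3n^2 + 5 is in O(n ↦ n^2): choose c = 4 and n₀ = 2; then for every n > 2 we have n^2 > 5 and therefore 3n^2 + 5 < 4n^2.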
Now we are prepared to approach the real question.
The class P contains all languages L for which there exists a Turing machine D that decides L and a constant k ∈ N such that for every input w, D halts after at most T(|w|) steps for a function T ∈ O(n ↦ n^k).
Since O(n ↦ n^k), while mathematically correct, is inconvenient to write and read, most people – to be honest, everybody except myself – simply write O(n^k).
Note that the bound depends on the length of w. Therefore, the argument you make for the language of the primes is only correct for numbers in unary encoding, where for the encoding w of a number n, the length |w| of the encoding is proportional to n. Nobody would ever use such an encoding in practice. Using a more advanced algorithm than simply trying all possible factors, it can be shown, however, that the language of prime numbers remains in P if the inputs are encoded in binary (or any other base). (Despite massive interest, this was only proven in 2004 by Manindra Agrawal, Neeraj Kayal, and Nitin Saxena in an award-winning paper, so you can guess that the algorithm is not very simple.)
The trivial languages {} and Σ* and the non-trivial language {ε} are obviously in P (for any alphabet Σ). Can you write functions in your favorite programming language that take a string as input and return a boolean telling whether the string is a word from the language for each of these and prove that your function has polynomial run-time complexity?
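If you want to check your answers, here is one possible solution in Python (the function names are mine). Each decider runs in constant time, which is certainly polynomial:

```python
def decide_empty_language(w: str) -> bool:
    # Decider for {}: no word belongs to the empty language.
    return False

def decide_sigma_star(w: str) -> bool:
    # Decider for Sigma*: every word belongs to it.
    return True

def decide_epsilon_language(w: str) -> bool:
    # Decider for {epsilon}: only the empty word belongs to it.
    return len(w) == 0
```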
Every regular language (a language described by a regular expression) is in P.
Let Σ be an alphabet and L ⊆ Σ*. A machine V that takes an encoded tuple of two words w, c ∈ Σ* and outputs 0 or 1 after a finite number of steps is a verifier for L if it has the following properties.
- Given (w, c), V outputs 1 only if w ∈ L.
- For every w ∈ L, there exists a c ∈ Σ* such that V(w, c) = 1.
The c in the above definition is called a witness (or certificate).
A verifier is allowed to give false negatives for the wrong witness even if w actually is in L. It is not, however, allowed to give false positives. It is also required that for each word in the language, there exists at least one witness.
For the language COMPOSITE, which contains the decimal encodings of all integers that are not prime, a witness could be a factorization. For example, (659, 709) is a witness for 467231 ∈ COMPOSITE. You can easily verify that on a sheet of paper, while, without the witness given, proving that 467231 is not prime would be difficult without using a computer.
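For illustration, here is what such a verifier could look like in Python (passing integers directly instead of their decimal encodings, for readability):

```python
def verify_composite(w: int, c: tuple) -> bool:
    # Accept only if the witness c = (p, q) is a genuine non-trivial
    # factorization of w; no false positives are possible.
    p, q = c
    return p > 1 and q > 1 and p * q == w

print(verify_composite(467231, (659, 709)))  # True
print(verify_composite(467231, (659, 708)))  # False: wrong witness, rejected
```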
We didn't say anything about how an appropriate witness can be found. This is the non-deterministic part.
The class NP contains all languages L for which there exists a Turing machine V that verifies L and a constant k ∈ N such that for every input (w, c), V halts after at most T(|w|) steps for a function T ∈ O(n ↦ n^k).
Note that the above definition implies that for each w ∈ L there exists a witness c with |c| ≤ T(|w|). (The Turing machine cannot possibly look at more symbols of the witness.)
NP is a superset of P (why?). It is not known whether there exist languages that are in NP but not in P.
Integer factorization is not a language per se. However, we can construct a language that represents the decision problem associated with it. That is, a language that contains all tuples (n, m) such that n has a factor d with 1 < d ≤ m. Let us call this language FACTOR. If you have an algorithm to decide FACTOR, it can be used to compute a full factorization with only polynomial overhead by performing a recursive binary search for each prime factor.
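Here is a Python sketch of that reduction. The function in_factor is a naive stand-in for a hypothetical FACTOR decider (no polynomial-time one is known); what matters is only how often the reduction consults it:

```python
def in_factor(n: int, m: int) -> bool:
    # Stand-in oracle for FACTOR: does n have a factor d with 1 < d <= m?
    # Implemented naively here; the reduction below only cares about
    # the *number* of queries, not how the oracle answers them.
    return any(n % d == 0 for d in range(2, m + 1))

def smallest_factor(n: int) -> int:
    # Binary search for the smallest non-trivial factor of n,
    # using O(log n) queries to the FACTOR oracle.
    lo, hi = 2, n
    while lo < hi:
        mid = (lo + hi) // 2
        if in_factor(n, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

def factorize(n: int) -> list:
    # Full prime factorization: n has at most log2(n) prime factors,
    # so the total number of oracle queries is polynomial in |w|.
    factors = []
    while n > 1:
        d = smallest_factor(n)  # the smallest non-trivial factor is prime
        factors.append(d)
        n //= d
    return factors

print(factorize(467231))  # [659, 709]
```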
It is easy to show that FACTOR is in NP. An appropriate witness would simply be the factor d itself, and all the verifier has to do is check that 1 < d ≤ m and n mod d = 0. All of this can be done in polynomial time. (Remember, again, that it is the length of the encoding that counts and that it is logarithmic in n.)
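Such a verifier could look like this in Python (again with integers instead of encoded strings):

```python
def verify_factor(w: tuple, c: int) -> bool:
    # Verifier for FACTOR: w encodes the pair (n, m), the witness c is
    # a claimed factor d.  Two comparisons and one division, which is
    # polynomial in the length of the (binary) encodings.
    n, m = w
    return 1 < c <= m and n % c == 0

print(verify_factor((467231, 1000), 659))  # True
print(verify_factor((467231, 500), 659))   # False: witness violates d <= m
```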
If you can show that FACTOR is also in P, you can be sure to get many cool awards. (And you have broken a significant portion of today's cryptography.)
For every language in NP, there is a brute-force algorithm that decides it deterministically. It simply performs an exhaustive search over all witnesses. (Note that the maximum length of a witness is bounded by a polynomial.) So, your algorithm to decide PRIMES was actually a brute-force algorithm to decide COMPOSITE.
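A generic sketch of this exhaustive search, assuming the verifier is given as a Python function together with its polynomial bound on the witness length:

```python
from itertools import product

def brute_force_decide(w, verifier, alphabet, poly_bound):
    # Deterministically decide w by trying every candidate witness c
    # with |c| <= poly_bound(|w|) and accepting as soon as the
    # verifier does.  Correct, but exponential time in general.
    for length in range(poly_bound(len(w)) + 1):
        for symbols in product(alphabet, repeat=length):
            if verifier(w, "".join(symbols)):
                return True
    return False
```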
To address your final question, we need to introduce reduction. Reductions are a very powerful concept of theoretical computer science. Reducing one problem to another basically means solving one problem by solving another problem.
Let Σ be an alphabet and A and B be languages over Σ. A is polynomial-time many-one reducible to B if there exists a function f: Σ* → Σ* with the following properties.
- w ∈ A ⇔ f(w) ∈ B for all w ∈ Σ*.
- The function f can be computed by a Turing machine for every input w in a number of steps bounded by a polynomial in |w|.
In this case, we write A ≤p B.
For example, let A be the language that contains all graphs (encoded as adjacency matrix) that contain a triangle. (A triangle is a cycle of length 3.) Let further B be the language that contains all matrices with non-zero trace. (The trace of a matrix is the sum of its main diagonal elements.) Then A is polynomial-time many-one reducible to B. To prove this, we need to find an appropriate transformation function f. In this case, we can set f to compute the 3rd power of the adjacency matrix. This requires two matrix-matrix products, each of which has polynomial complexity.
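Here is the reduction spelled out in Python (plain nested lists as matrices; the function names are mine):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def f(adjacency):
    # The reduction: map the adjacency matrix to its third power.
    # Entry (i, i) of A^3 counts the closed walks of length 3 at
    # vertex i, so the trace is non-zero iff the graph has a triangle.
    return matmul(matmul(adjacency, adjacency), adjacency)

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

G = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]          # the triangle on three vertices
print(trace(f(G)) != 0)  # True: G is in A, so f(G) is in B
```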
It is trivially true that L ≤p L. (Can you prove it formally?)
We'll apply this to NP now.
A language L is NP-hard if and only if L' ≤p L for every language L' ∈ NP.
An NP-hard language may or may not be in NP itself.
A language L is NP-complete if and only if
- L ∈ NP and
- L is NP-hard.
The most famous NP-complete language is SAT. It contains all boolean formulas that can be satisfied. For example, (a ∨ b) ∧ (¬a ∨ ¬b) ∈ SAT. A valid witness is {a = 1, b = 0}. The formula (a ∨ b) ∧ (¬a ∨ b) ∧ ¬b ∉ SAT. (How would you prove that?)
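For formulas this small, you can prove (un)satisfiability by simply trying all assignments, which is what this little Python sketch does:

```python
from itertools import product

def satisfiable(formula, variables):
    # Try all 2^n assignments; fine for tiny formulas, exponential in
    # general, which is exactly why the complexity of SAT matters.
    return any(formula(**dict(zip(variables, bits)))
               for bits in product([False, True], repeat=len(variables)))

print(satisfiable(lambda a, b: (a or b) and (not a or not b), "ab"))        # True
print(satisfiable(lambda a, b: (a or b) and (not a or b) and not b, "ab"))  # False
```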
It is not difficult to show that SAT ∈ NP. To show the NP-hardness of SAT is some work but it was done in 1971 by Stephen Cook.
Once that one NP-complete language was known, it was relatively simple to show the NP-completeness of other languages via reduction. If language A is known to be NP-hard, then showing that A ≤p B shows that B is NP-hard, too (via the transitivity of “≤p”). In 1972 Richard Karp published a list of 21 languages that he could show were NP-complete via (transitive) reduction of SAT. (This is the only paper in this answer that I actually recommend you should read. Unlike the others, it is not hard to understand and gives a very good idea of how proving NP-completeness via reduction works.)
Finally, a short summary. We'll use the symbols NPH and NPC to denote the classes of NP-hard and NP-complete languages respectively.
- P ⊆ NP
- NPC ⊂ NP and NPC ⊂ NPH, actually NPC = NP ∩ NPH by definition
- (A ∈ NP) ∧ (B ∈ NPH) ⇒ A ≤p B
Note that the inclusion NPC ⊂ NP is proper even in the case that P = NP. To see this, convince yourself that no non-trivial language can be reduced to a trivial one, and that there are trivial languages in P as well as non-trivial languages in NP. This is a (not very interesting) corner case, though.
Addendum
Your primary source of confusion seems to be that you were thinking of the “n” in “O(n ↦ f(n))” as the interpretation of an algorithm's input when it actually refers to the length of the input. This is an important distinction because it means that the asymptotic complexity of an algorithm depends on the encoding used for the input.
This week, a new record for the largest known Mersenne prime was achieved. The largest currently known prime number is 2^74 207 281 − 1. This number is so huge that it gives me a headache, so I'll use a smaller one in the following example: 2^31 − 1 = 2 147 483 647. It can be encoded in different ways.
- by its Mersenne exponent, as a decimal number: 31 (2 bytes)
- as a decimal number: 2147483647 (10 bytes)
- as a unary number: 11111…11 where the … is to be replaced by 2 147 483 640 more 1s (almost 2 GiB)
All these strings encode the same number and given any of these, we can easily construct any other encoding of the same number. (You can replace decimal encoding with binary, octal or hexadecimal if you want to. It only changes the length by a constant factor.)
The naive algorithm for testing primality is only polynomial for unary encodings. The AKS primality test is polynomial for decimal (or any other base b ≥ 2) encodings. The Lucas–Lehmer primality test is the best known algorithm for Mersenne numbers M_p with p an odd prime, but it is still exponential in the length of the binary encoding of the Mersenne exponent p (polynomial in p).
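For the curious, here is a minimal Python sketch of the Lucas–Lehmer test. The loop runs p − 2 times, so the running time is polynomial in p but exponential in the length of the binary encoding of p:

```python
def lucas_lehmer(p):
    # Test whether the Mersenne number M_p = 2**p - 1 is prime,
    # for an odd prime p.  The loop body runs p - 2 times.
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

print([p for p in (3, 5, 7, 11, 13) if lucas_lehmer(p)])
# [3, 5, 7, 13]; M_11 = 2047 = 23 * 89 is composite
```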
If we want to talk about the complexity of an algorithm, it is very important that we are very clear what representation we use. In general, one can assume that the most efficient encoding is used. That is, binary for integers. (Note that not every prime number is a Mersenne prime so using the Mersenne exponent is not a general encoding scheme.)
In theoretical cryptography, many algorithms are formally passed a completely useless string of k 1s as the first parameter. The algorithm never looks at this parameter, but it allows it to formally be polynomial in k, which is the security parameter used to tune the security of the procedure.
For some problems for which the decision language in binary encoding is NP-complete, the decision language is no longer NP-complete if the encoding of embedded numbers is switched to unary. The decision languages for other problems remain NP-complete even then. The latter are called strongly NP-complete. The best known example is bin packing.
It is also (and perhaps more) interesting to see how the complexity of an algorithm changes if the input is compressed. For the example of Mersenne primes, we have seen three encodings, each of which is logarithmically more compressed than its predecessor.
In 1983, Hana Galperin and Avi Wigderson wrote an interesting paper about the complexity of common graph algorithms when the input encoding of the graph is compressed logarithmically. For these inputs, the language of graphs containing a triangle from above (where it was clearly in P) suddenly becomes NP-complete.
And that's because complexity classes like P and NP are defined for languages, not for problems.