Edited to directly address Massimo's requirement that no external code be used (a.k.a. clean-room development).
At a minimum, one has to implement all of the mechanics needed to parse (but not render) PDF document structure, according to the PDF specification, i.e. ISO 32000-1:2008, PDF 1.7 (purchase required).
PDF document structure is a very big and deep tree, comparable to some of the world's most complicated XML documents. The "schema" of PDF is documented in the file format specification cited above.
Once you arrive at a document page's "content stream" node (there can be one or more, and they can be nested), or at the "Form XObject" nodes nested inside a page, you can parse its content and look for text delimited by the BT and ET operators. There are lots of other things to watch out for, though.
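As a minimal sketch of that last step, here is some Python that scans an already-decompressed content stream; the function name is mine, and the regexes are a simplification, not a conforming tokenizer (BT can legally appear inside string literals, and text is also shown by the TJ, ' and " operators):

```python
import re

def find_text_blocks(content: bytes):
    """Naively pull literal strings out of BT ... ET blocks.

    Caveat: a real parser must tokenize the stream properly; this
    sketch only handles the simple (string) Tj case.
    """
    for block in re.finditer(rb"\bBT\b(.*?)\bET\b", content, re.DOTALL):
        body = block.group(1)
        # Literal strings shown with Tj look like: (Hello, world) Tj
        for s in re.finditer(rb"\((.*?)\)\s*Tj", body, re.DOTALL):
            yield s.group(1)

# Tiny hand-made content stream for illustration:
stream = b"BT /F1 12 Tf 72 712 Td (Hello, world) Tj ET"
print(list(find_text_blocks(stream)))   # [b'Hello, world']
```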
Text content streams are almost always Deflate-compressed (the compression commonly associated with the "zlib" library). So you will need to find the beginning of the deflated stream, determine its length, decompress it, and then search for strings inside.
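A hedged sketch of that decompression step, assuming the easy case: the file is not encrypted, /Length is a direct integer (it is often an indirect reference), and the only filter is a single /FlateDecode. The file name is hypothetical.

```python
import re
import zlib

def iter_flate_streams(pdf_bytes: bytes):
    """Scan raw PDF bytes for FlateDecode streams and decompress them.

    Simplifications: direct /Length only, no encryption, and filter
    chains other than a lone /FlateDecode are ignored.
    """
    pattern = re.compile(rb"<<(?P<dict>.*?)>>\s*stream\r?\n", re.DOTALL)
    for m in pattern.finditer(pdf_bytes):
        d = m.group("dict")
        if b"/FlateDecode" not in d:
            continue
        length = re.search(rb"/Length\s+(\d+)", d)
        if not length:
            continue  # /Length is an indirect reference; skipped here
        start = m.end()
        raw = pdf_bytes[start:start + int(length.group(1))]
        try:
            yield zlib.decompress(raw)
        except zlib.error:
            pass  # corrupt, encrypted, or part of a filter chain

with open("example.pdf", "rb") as f:   # hypothetical input file
    for data in iter_flate_streams(f.read()):
        if b"BT" in data:
            print(data[:200])
```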
To find images, you will need to find "Image XObjects" and a few other constructs that can contain image data.
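In the same sketch style, an Image XObject can be recognized by its dictionary entries; the function below is a naive byte-level scan of my own devising, and it only spots the objects. Actually decoding the pixels depends on the filter (DCTDecode data is essentially an ordinary JPEG; most others need real decoders).

```python
import re

def find_image_xobjects(pdf_bytes: bytes):
    """Locate Image XObject dictionaries in raw PDF bytes (naive scan)."""
    for m in re.finditer(rb"<<(?P<dict>.*?)>>\s*stream", pdf_bytes, re.DOTALL):
        d = m.group("dict")
        if b"/Subtype" in d and b"/Image" in d:
            width = re.search(rb"/Width\s+(\d+)", d)
            height = re.search(rb"/Height\s+(\d+)", d)
            filt = re.search(rb"/Filter\s*/(\w+)", d)
            yield {
                "width": int(width.group(1)) if width else None,
                "height": int(height.group(1)) if height else None,
                "filter": filt.group(1).decode() if filt else None,
            }
```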
This is an incomplete explanation, and it applies only to those who are not using libraries or tools.
There are many approaches. All of them require libraries or tools; some are open-source or liberally licensed, while others require commercial licenses for commercial use, including use in hosted services. So the quick answer is: no.
Content streams, whether they carry text or images, are almost always compressed. Some are Deflate-compressed, but about a dozen other compression methods (called filters) are used in PDF.
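To give a feel for the filter zoo, here is a hedged sketch of a decoder dispatch table. Only the two simplest filters are implemented; the names are the standard ones from the specification, and each commented-out entry would need its own decoder.

```python
import binascii
import zlib

def flate_decode(data: bytes) -> bytes:
    return zlib.decompress(data)

def ascii_hex_decode(data: bytes) -> bytes:
    # '>' terminates the stream; whitespace is ignored per the spec.
    hex_text = b"".join(data.split(b">")[0].split())
    if len(hex_text) % 2:           # odd length: pad with a trailing 0
        hex_text += b"0"
    return binascii.unhexlify(hex_text)

DECODERS = {
    b"FlateDecode": flate_decode,
    b"ASCIIHexDecode": ascii_hex_decode,
    # b"ASCII85Decode":   ...  # base-85; Python's base64.a85decode helps
    # b"LZWDecode":       ...  # TIFF-style LZW, must be hand-written
    # b"RunLengthDecode": ...
    # b"CCITTFaxDecode":  ...  # Group 3/4 fax, common in scanned images
    # b"DCTDecode":       ...  # plain JPEG data
}

def apply_filters(data: bytes, filters: list[bytes]) -> bytes:
    """Filters form a chain and are applied in order."""
    for name in filters:
        data = DECODERS[name](data)  # KeyError = filter not implemented
    return data
```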
Tools exist to extract plain text from PDFs. A PDF that contains scanned but unrecognized text (i.e. human-readable but not machine-readable) must first be rendered to an image and then OCRed.
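A sketch of that two-path workflow, assuming Poppler's pdftotext and pdftoppm plus the Tesseract CLI are installed on the machine; the file names here are hypothetical:

```python
import subprocess

def extract_text(pdf_path: str, out_txt: str) -> None:
    # Path 1: the PDF has real text objects; Poppler extracts them directly.
    subprocess.run(["pdftotext", pdf_path, out_txt], check=True)

def ocr_scanned(pdf_path: str, out_base: str) -> None:
    # Path 2: the PDF is scanned images; render pages to PNG, then OCR.
    subprocess.run(["pdftoppm", "-png", pdf_path, out_base], check=True)
    # pdftoppm names pages out_base-1.png, out_base-2.png, ...
    # (zero-padded when the document has many pages).
    subprocess.run(["tesseract", f"{out_base}-1.png", out_base], check=True)

extract_text("scan.pdf", "scan.txt")  # yields little or nothing if scanned
ocr_scanned("scan.pdf", "scan")       # writes scan.txt via OCR instead
```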
Certain types of PDF contain neither text nor images: they contain vector-graphics commands that render text one stroke at a time. The capital letter "A" may be stored as several strokes, for example. For this type of PDF, it is again necessary to render these graphics commands, producing a human-readable image, which is then OCRed into machine-readable text.
There is no quick way to parse "the PDF header" and be done: the meaningful structure lives in the cross-reference table, the trailer, and the object graph they point into. In fact, just parsing that much correctly is complicated enough that it would make you a famous programmer and the CEO of a tech company.
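The literal header is just the first line of the file; everything after it is the hard part. A minimal sketch (function name mine, file name hypothetical) that reads only the two entry points, the %PDF-x.y version line and the startxref offset, which a real parser must then follow into the cross-reference table and trailer:

```python
import re

def pdf_entry_points(pdf_bytes: bytes):
    """Read the %PDF-x.y header and the trailing startxref offset.

    This is only the doorway: a real parser must then read the xref
    table (or xref stream) at that offset, the trailer dictionary,
    and walk the object graph from the document catalog.
    """
    header = re.match(rb"%PDF-(\d+\.\d+)", pdf_bytes)
    # startxref appears near the end, before %%EOF; take the last one.
    tail = pdf_bytes[-1024:]
    offsets = re.findall(rb"startxref\s+(\d+)", tail)
    return (header.group(1).decode() if header else None,
            int(offsets[-1]) if offsets else None)

with open("example.pdf", "rb") as f:   # hypothetical input file
    version, xref_offset = pdf_entry_points(f.read())
print(version, xref_offset)
```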
To estimate whether a task is doable with PDF, it is important to get a first-hand view of a PDF file's structure. Several diagnostic tools can show a tree view of that structure.
(Note: there are many such tools; this is not an advertisement or endorsement. These diagnostic tools are for a programmer's use in understanding PDF structure; they are not software components that can be integrated, shipped, or resold.)
I haven't used Poppler, but it seems promising.
This part is off-topic, but it serves as a reminder that the clean-room requirement can be relevant if, for example:
- One is implementing this functionality in a new programming language (think of some of the rising stars, e.g. Go and Rust).
- One is implementing this in a language without creating external processes or calling foreign-language functions (such as implementing it in JavaScript that runs in browsers).
- One is implementing this in hardware, where the basic computational elements are stream processing (compress, decompress, filter) and string pattern matching, such as in firewalls and novel architectures, e.g. the Automata Processor.