Encoding text to image and decoding back to text

Question

(FYI: I asked this question on Stack Overflow and was directed here.)

My problem is as follows:

I have a dashboard running on a server at customer premises which updates every 15 minutes
The dashboard has multiple views
The dashboard has hundreds of KPIs (key performance indicators)
I want to replicate the dashboard at my computer remotely
The server is not allowed to send any emails directly
The customer only allows snapshots of the screen to be moved out of their premises, using a Citrix client
I can capture any report visible on the Citrix client screen (I can also change the report on the screen)

Currently, I have a scheduled task that captures the relevant screens every 15 minutes and sends them to me via email. I want to be able to store the results long-term in a database, for which I need to read the text out of these images. From what I have read, generating QR codes seems to be a good option.

I am thinking of:

Generating a CSV file of the KPIs used in the dashboard
Using QR encoding libraries in Python to encode the CSV file as a QR code
Showing the generated QR code on the screen (possibly multiple QR codes)
Capturing the screen and sending it over via email
Using QR decoding libraries in Python to decode the QR code
Possibly combining the above process with OCR to resolve any decoding conflicts

My specific question: before getting into the implementation, is there a better way to move text information as an image? Thank you!

Note: Although it is not a requirement, I would likely be working with Python because the current dashboard backend is Python-based.

Yes, there is a "better" (I think you mean simpler) solution: tell the client that their restriction to sending only screenshots is bullshit, since you are going to circumvent it either way. — Doc Brown, Jan 12 '23 at 07:44
Sorry, I'm struggling with the snapshot thing. Don't you have access or control over the server side? _(I can also change the report on the screen)_ sounds like you have it. What does _I have a scheduled task to capture relevant screens every 15 minutes and send it over to me via email_ mean? If you have such a task on the client side, I don't know why you don't simply make the dashboard dump the data into a file and send the file, instead of the snapshot. Who set the task? In other words, what's your role here? — Laiv, Jan 12 '23 at 08:13
@DocBrown Unfortunately, the client is not going to change the SOP. A copy of the image I email is captured by them as well for monitoring purposes. They are not inclined to do it any other way due to their policies (and possibly lack of will). — Imtiaz, Jan 12 '23 at 08:16
@Laiv Thank you for taking the time. I created the dashboard that runs on the server. I can enhance the dashboard to include all kinds of reports. However, I cannot send anything out of the server. There is another *client* computer sitting at the customer premises from which I connect to the server via a Citrix client. I can access the dashboard from this client. I can either go to the client's office and use this client computer there, or send screen captures from it as I stated in my original question. Hope it clears it up a little. Thanks. — Imtiaz, Jan 12 '23 at 08:26
@Imtiaz: if I understood this correctly, you can send any kind of image you like by email, right? I mean, you can create an arbitrary image, show it on the screen, make a pixel-accurate screenshot and send it to yourself? Then you can take an arbitrary file, interpret the bytes as RGB values and encode it that way as a picture. That is about 100 times simpler than this QR code nonsense. — Doc Brown, Jan 12 '23 at 08:29
@DocBrown that sounds interesting. Could you elaborate a bit on your solution? — Laiv, Jan 12 '23 at 08:33
... of course, when the screenshots may contain defects, then using an error correction encoding may become necessary, and QR codes are indeed a way to achieve this (but probably not the most effective one). How is the quality of the screenshots? Do you get the exact original colors, pixel-by-pixel into them? — Doc Brown, Jan 12 '23 at 08:35
@DocBrown As I understand, you are suggesting simple OCR if the captures are good quality? The captures are, in fact, high resolution - don't know if it is exact but pretty close match. — Imtiaz, Jan 12 '23 at 08:39
@Imtiaz: no, I don't suggest OCR. I think your approach is still way too complicated. But we need to know precisely about the screenshot quality: for example, let's say you include a black/white image with 3x3 pixels (9 bits) in one of your reports. Now you make a screenshot and send it to yourself by email. Is it guaranteed the 9 bits will reach you in the original form? — Doc Brown, Jan 12 '23 at 08:44
@Laiv: grouping the bytes of a file in tuples of 3 is not rocket science. However, as long as the OP does not tell us if they can make screenshots which keep the original RGB values, it is not clear if this may be "too simplistic". — Doc Brown, Jan 12 '23 at 08:59
@DocBrown Thank you for your inputs and fixing the initial question. I will need some time to answer your question; as I will have to get client permission to make server-side software changes (to put the image on server side in the manner you suggested), visit the client to deploy the changes and then test the image quality of the received image. — Imtiaz, Jan 12 '23 at 09:27

Doc Brown · Accepted Answer · 2023-01-13T06:38:02.220

If your only way of transporting data in a machine-readable way from one system to another is by making screenshots, your idea of using QR codes is surely an approach which will technically work and could even bridge the gap between a monitor where the images are displayed, and a physically separated camera which makes pictures of that monitor.

However, for your actual situation, this might not be a very efficient approach, neither in data density (huge images for very little data) nor in developer effort. I am under the impression you are cracking a nut with a sledgehammer (though QR code libraries will probably be easy to find and use today).

In case you can make pixel-accurate screenshots, which preserve the original RGB color of each pixel, a much simpler approach would be to

interpret your original text file as bytes
group the bytes into tuples of three
interpret the 3-tuples as RGB values of an image, which you display as a report
save the screenshot in a lossless picture format like PNG

The decoding then works the other way round. This gives you 24 bits of payload per pixel.
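The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 4-byte length header, and the fixed row width are choices made for this sketch and are not part of the original suggestion. Rendering the rows on screen and reading them back from the PNG would need an imaging library and is left out here.

```python
import struct

def encode_to_pixels(data: bytes, width: int = 64):
    """Pack bytes into rows of (R, G, B) tuples.

    Assumption of this sketch: a 4-byte big-endian length header is
    prepended so the decoder knows where the padding starts.
    """
    payload = struct.pack(">I", len(data)) + data
    # Pad to a multiple of 3 bytes so every pixel has three channels.
    payload += b"\x00" * (-len(payload) % 3)
    pixels = [tuple(payload[i:i + 3]) for i in range(0, len(payload), 3)]
    # Pad with black pixels so the last row is complete.
    pixels += [(0, 0, 0)] * (-len(pixels) % width)
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]

def decode_from_pixels(rows):
    """Reverse encode_to_pixels: flatten the channels and strip the padding."""
    flat = bytes(channel for row in rows for pixel in row for channel in pixel)
    (length,) = struct.unpack(">I", flat[:4])
    return flat[4:4 + length]

csv_bytes = "kpi,value\nuptime,99.9\n".encode("utf-8")
rows = encode_to_pixels(csv_bytes, width=4)
restored = decode_from_pixels(rows)
```

This only round-trips if every pixel arrives with exactly the RGB values that were displayed, which is the precondition stated above.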

Of course, if your screenshots are not 100% accurate and may incorporate some data loss or color failure, you have to invest more work:

if only color-correctness is your problem, you could reduce the number of used colors, maybe down to black & white. That would reduce data density to 1 bit per pixel
if detecting the exact position of the image is a problem, add some easy to find edge markers to it
if you are losing a few pixels now and then, you could use an error-correcting code such as Reed-Solomon encoding. Of course, QR codes use this kind of error correction too, but I would expect that by using such a code directly, you can still achieve a much higher data density. See this Stack Overflow Q&A for a Python example.
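As a rough illustration of the first and last points, the sketch below stores 1 bit per pixel as black or white and decodes by thresholding, so moderate color drift does not matter. For error correction it uses a threefold repetition code with majority voting, which is a deliberately simple stand-in and not Reed-Solomon; it costs far more density than a real Reed-Solomon code would. All names and the threshold of 128 are assumptions of this sketch.

```python
REPEAT = 3  # each bit is drawn as REPEAT pixels (assumption of this sketch)

def bytes_to_bits(data: bytes):
    """Most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]

def bits_to_bytes(bits):
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)

def encode_bw(data: bytes):
    """Return grey levels (0 = black, 255 = white), REPEAT pixels per bit."""
    return [255 if bit else 0 for bit in bytes_to_bits(data) for _ in range(REPEAT)]

def decode_bw(levels, threshold: int = 128):
    """Threshold each pixel, then take a majority vote per group of REPEAT."""
    bits = [1 if level >= threshold else 0 for level in levels]
    voted = [
        1 if sum(bits[i:i + REPEAT]) * 2 > REPEAT else 0
        for i in range(0, len(bits), REPEAT)
    ]
    return bits_to_bytes(voted)

data = b"uptime,99.9"
levels = encode_bw(data)
# Simulate a lossy channel: every pixel drifts by 40 grey levels ...
noisy = [v + 40 if v == 0 else v - 40 for v in levels]
# ... and one pixel is flipped completely.
noisy[5] = 255 - noisy[5]
recovered = decode_bw(noisy)
```

Repeating each bit in adjacent pixels does not protect against a damaged region that covers a whole group; a real error-correcting code, or at least interleaving the repeats, would handle that case better.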

In the end, it all depends on the quality of your transport channel. You may need to apply some error correction algorithm, and there are several different available, for all kinds of purposes.

On OCR: using standard Latin characters as a "coding scheme" makes the data readable for humans, but unnecessarily hard to read for a machine and more error-prone than necessary. I would strongly recommend against going that route unless other constraints force you to.

Let me add a final word about your case as a whole: I guess the restriction to getting data only "by screenshot" was originally intended to ensure the data is only accessed manually, by a human. Before you invest so much effort here to circumvent this restriction, you should explain to the client what you are going to do. I would expect this to have one of the following two outcomes:

they understand that you are foiling their original safety measures and will forbid it
they understand that their literal requirements only create an artificial hurdle and that, in case they pay you, they are just producing extra developer costs for themselves for no apparent benefit

Thank you for the detailed answer. It helps. I am unable to upvote because of my user restrictions. I will follow up and accept the answer after waiting some more for other answers. — Imtiaz, Jan 13 '23 at 05:51

score 1 · Answer 2 · answered Jan 12 '23 at 13:07

Check the screenshot quality first. The simplest solution may be to get a really small bitmap font (e.g. 5x3 pixels), remove all the line breaks and as much whitespace as you can, then render the text to the screen.

That lets the client actually read it if they are suspicious about exfiltration, and it lets you OCR it with 100% accuracy by bit-accurate matching of the original font.
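A minimal sketch of that bit-accurate matching, assuming a pixel-exact black and white capture. The glyph shapes, the 1-pixel gap between characters, and the function names are made up for this illustration; a real font would need one glyph per character that occurs in the data.

```python
# Hypothetical 3-wide, 5-tall glyphs; '#' is a dark pixel, '.' is a light one.
GLYPHS = {
    "0": ("###", "#.#", "#.#", "#.#", "###"),
    "1": (".#.", "##.", ".#.", ".#.", "###"),
    "2": ("###", "..#", "###", "#..", "###"),
    "9": ("###", "#.#", "###", "..#", "###"),
    ".": ("...", "...", "...", "...", ".#."),
}
# Reverse table: exact glyph bitmap -> character.
LOOKUP = {glyph: char for char, glyph in GLYPHS.items()}
WIDTH, HEIGHT, GAP = 3, 5, 1

def render(text: str):
    """Draw the text as HEIGHT rows of pixels, one blank column after each glyph."""
    return [
        "".join(GLYPHS[ch][row] + "." * GAP for ch in text)
        for row in range(HEIGHT)
    ]

def read(rows):
    """Cut the rows into fixed-width cells and look each cell up exactly."""
    step = WIDTH + GAP
    return "".join(
        LOOKUP[tuple(row[x:x + WIDTH] for row in rows)]
        for x in range(0, len(rows[0]), step)
    )

rows = render("99.21")
text = read(rows)
```

Because the lookup is an exact dictionary match, a single wrong pixel raises a `KeyError` instead of silently producing a wrong character, so this only works when the capture really is pixel-accurate.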
