Hi everybody,


Probably something very simple, but i'm having trouble downloading PDF files from our database. 


Situation:


We have a table in an Oracle DB with a BLOB column containing a PDF file. Downloading it in Oracle SQL developer results in a valid PDF file (the column says it should contain app. 1.200.000 bytes, the resulting PDF is app. this size...). 

In the module (web, NOT mobile) i query this table and retreive the binary data from the column in a local binary data variable. In debug mode it shows this variable hold the same amount of bytes. Now i put this variable in the File content of the Download element...

All goes well, a PDF is downloaded. However: THe resulting PDF is not 1.2MB but 1.8MB and it gives an error on opening: Not a valid PDF. 


It looks like some sort of encoding issue, but i am lost at finding how i can correct this. 

I hope someone has an illuminating idea.


K R

Alexander

HI Alexander,

I've never experienced this: getting Binary Data from the database, then Downloading it via the Download widget should never give encoding problems. Only when you accidentally have some Text variable assign somewhere could encoding problems arise.

Kilian Hekhuis wrote:

HI Alexander,

I've never experienced this: getting Binary Data from the database, then Downloading it via the Download widget should never give encoding problems. Only when you accidentally have some Text variable assign somewhere could encoding problems arise.

I agree. Same here, that's why it's so mind boggling. Strange part is that it is one of the most simple download actions i've made. But i can not think of any other reason than encoding issues here. 

1) THe extension that maps it to the DB recognises it as a blob and suggests it being a "Binary Data" type...

2) The amount of bytes being the same in the DB and in the debug mode and in the data preview mode of the aggregate retreiving the record.... 

Just the resulting file is of a different size and content... Comparing the original file and the downloaded one in notepad++ shows the same type of content (e.g. paragraphs of unreadable jibberish), just all different characters.... Hence my guess that somewhere the encoding kicks in???


THe variable the aggregate puts the binary data into is of the binary data type as well. I also had the content of the aggregate column assigned directly to the download file content without first assigning it to the variable, it is just that the variable is filled in a different screen action (preparation) than the one that is called for the download, hence the use of a variable...


Been fooling around with the BinaryData API. Specifically the ConvertEncoding part. Seems i can get the right amount of bytes (save 7 bytes) is i convert from unicode to utf-8. Which is really weird because the content does change. Here are three situations, all viewed in notepad++:


Original (working, 1.225.294 bytes):


"Normal" download (just content from DB right into the download element, not working pdf,  1.877.074 bytes):


Manipulate file content through ConvertEncoding(PdfDoc_ALV,"unicode", "utf-8") (not working pdf, 1.225.287 bytes):


Anyone?


Regards,

Alexander

Hi Alexander,

Just jumped into my brain: what's the Mime-Type of the download? I can imagine if it's wrong (or unrecognized by the browser), the browser tries to do some kind of conversion. Is it set to "application/octet-stream"? If not, can you try with that?

Hi Kilian,


Tried both application/pdf and application/octet-stream. Same results. 


Alexander

Hello Alexander

This really is a long shot, but have you tried application/x-pdf?


Best regards,

Ângelo

@kilian: 

I know application/octet-stream should guarantee the bytes to be sent over unchagned (apart from conversion to text i think), but we tested with a PDF document with only onle word, "test", in it generated on a windows PC. Still it was not able to show this document through the download functionality retreived from an Oracle DB connection. 

So either Oracle cheats and sends the BLOB as less of a binary data than it acutally is (although it ends up with the correct amount of bytes), or IIS is not able to translate these into the correct BASE64 (isn't that how content is delivered over the internet???). In case of the latter: 

Could it be that on our development server a codepage is missing or something I see it is ANSI encoded according to notepad++. ANSI needs a codepage i think... 

Would be handy to see how it tries to treat the document. 


Alexander


@Ãngelo: Tried it, did not work. 

Just to narrow things down. I printed the variable on the screen (with BinaryDataToText(PdfDoc_ALV)):



Compare this to the original (few posts higher) and you see that the part after %PDF-1.5% is the same as the original file viewed in notepad++ ..... Are we getting somewhere? 

It seems it goes wrong in the handling of the binary data. What goes wrong? 





Getting there. This is how it WILL work:


1) I do a BinaryToBase64(PdfDoc_ALV) and write this output to the screen.

2) I open notepad++ create a new file, ANSI coding

3) I Paste the Base64 string from the sourcecode of the page into the new file

4) I choose plugins (apparently an installed pulgin) --> mime converters --> base64 decode

5) I save the file with a PDF extension

6).... This works!!!!!!!

How is this possible? This proves the binary data read and held in the variable is the correct data, but the download functionality is not able to produce a correct file representation from it... 


Any input would be appreciated


Kind regards,

Alexander

Is there a PDF you can share? I'd like to see whether downloading it will produce a correct file or not.

I will post :

1) the original pdf

2) the base 64 string downloaded from DB Record BloB (BinaryToBase64)

3) the generatead pdf from that string

Base 64

generated pdf

Here also the download OS offers...

Ok, I'll take a look at it tomorrow, day's up :)

Hello :)

I have take a look in your PDF and it has nothing wrong with it (the original.
I can upload and download it without problem in a server with Oracle database (see attached OML).

So, I would say that your code to "upload" the file to the database is changing it somehow.

Cheers.

Thanks for looking Eduardo.. Upload / download combi has indeed been working for us as well (we have several test apps in which this works like a charm). 

Trouble here is that the upload is done through a backend application programmed in Java. Still, our old .Net (DotNetNuke) application retreives the PDF's as blob without any problem, Also going through the process of the base64 encoding - downloading that - base64 decoding which welivers the correct PDF suggests that we SHOULD be able to somehow get a functional PDF to download. 

This is also why i figured it might be some encoding issue. Although Binary Data is Binary Data, it still was sent to the DB over an internet connection through a web server and, as i stated before, i think those transfers are always done in base64 text.... So yes, it could have something to do with the upload, but there IS a way to read the file (through base64 route), and the old .net app can read it, so i would like to know what to do to get OutSystems to read it and serve it.

Kind regards,

Alexander

Well, if the file binary is being converted to base64, what is indeed an "upload" issue, you can use the binary extension, where you have a method to convert base64  to binary, before download.

See here: https://www.outsystems.com/help/ServiceStudio/9.0/APIs/BinaryData_API.htm

This probably will solve the problem.

Cheers.

Eduardo Jauch

Eduardo Jauch wrote:

Well, if the file binary is being converted to base64, what is indeed an "upload" issue, you can use the binary extension, where you have a method to convert base64  to binary, before download.

This probably will solve the problem.

Cheers.

Eduardo Jauch


Eduardo,

The base64 example is only to test the blob column data content. Was for me to show that the correct data was in there. It is not stored as base64, i specifically base64 encoded the blob column out of my aggregate to be able to get the content in a base64 encoded form (which i copied out of the source of the webpage to use elsewhere).

I just always understood that ALL binary data was sent over the internet in a base64 encoded string (and decoded on the webserver side back to binary data)...

Although i havn't looked very far in this java code, the upload code is very old, it uses a base64 decode to retreive the data and then write the pdf to the database (in .net i never had to do this manually, it happened under water but i can understand it would ne necessary). Something could of course go wrong, but again: The hint that i AM able to view to document in an older .Net application and that i AM able to generate a functional pdf of the base64 encode representation of the column leads me to beleive it should be possible. 


PS, forgot to mention.... For testing issue i uploaded the pdf documents found in one of the older posts to the database using the SQL Developer tool directly, so not through a user interface..... Same issue, so i don't think it is the upload functionality unless Oracle SQL developer also makes the same "mistake", 

Ok, so summerizing:

  1. There's binary content in the database;
  2. Retrieving that content and downloading it via a Download results in a corrupted file (in Alexander's app);
  3. Retrieving that content, converting it to Base64 and downloading the result, converting it back to binary once downloaded, results in the correct file;
  4. A test program made by Eduardo to upload and download that same file works fine;
  5. An older .NET application that retrieves the same binary content from the database works fine.

@Alexander: can you share the properties of the Entity Attribute that contains the binary data in the database? It seems there is a conversion of the binary data somehow when downloading, but only when doing so from its original Entity, not via Eduardo's test Entity.

The by oracle generated create table statement should contain all the details:


  CREATE TABLE "GRN_DCT_PDF_DOCUMENTEN"

   ("DOC_ID" NUMBER NOT NULL ENABLE,

"DOC_TIMESTAMP" DATE NOT NULL ENABLE,

"GRN_OWNER_ID" NUMBER NOT NULL ENABLE,

"GRN_ACTOR_ID" NUMBER NOT NULL ENABLE,

"DOC_BLOB" BLOB NOT NULL ENABLE,

 CONSTRAINT "GRN_DOC_PK" PRIMARY KEY ("DOC_ID")

  USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255

  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645

  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1

  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)

  TABLESPACE "XXXX"  ENABLE

   ) SEGMENT CREATION IMMEDIATE

  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255

 NOCOMPRESS LOGGING

  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645

  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1

  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)

  TABLESPACE "XXXX"

 LOB ("DOC_BLOB") STORE AS BASICFILE (

  TABLESPACE "XXXX" ENABLE STORAGE IN ROW CHUNK 8192 RETENTION

  NOCACHE LOGGING

  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645

  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1

  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)) ;


Stupid, might be crucial information: This is NOT the OutSystems database (although that is also oracle), it is our "central"  database). However, we use this for a lot of Outsystems Applications now. We create extensions for this. The extension for this table looks like this:


The aggregate:

The database characterset:

SELECT * FROM NLS_DATABASE_PARAMETERS

where parameter = 'NLS_CHARACTERSET'

 

WE8ISO8859P1


Could not think of any other info that could help...


Alexander

Images not showing right. Will upload them. 


Aggregate

extension

Thanks, I see the attribute is indeed Binary Data, so that shouldn't be a problem. But I have little experience with external Oracle DBs, so I don't know whether there's any possible conversion going on (but probably not, if the base64 encoding produces the right results).

Please check that your database does have the right encoding (and the same as outsystems expects)

It could be using utf-16 or some basic latin.


If that not solve your issue, I think there is also an issue with Outsystems itself, because they are doing some stuff between database and your logic, which could alter the data.

Kilian Hekhuis wrote:

Thanks, I see the attribute is indeed Binary Data, so that shouldn't be a problem. But I have little experience with external Oracle DBs, so I don't know whether there's any possible conversion going on (but probably not, if the base64 encoding produces the right results).


Stupid thing is that the file it servers looks a lot like what it should be. It just has a lot of ? (or blank) instead of ä kind of characters.... So it almost looks like the webserver itsself is somehow interpreting the binary data. I'm having a hard time beleiving it is the extension or something on the database side, especially since i  am able to download the base63 encoded string, decode it and end up with a correct PDF file.... That sort of prooves that the column in OutSystems still is in some kind of unerstandable format. 


Hi J.,


You mean this one?


SELECT * FROM NLS_DATABASE_PARAMETERS

where parameter = 'NLS_CHARACTERSET'

 

WE8ISO8859P1

Hello Alexander.

Your corrupted PDF file is the result of trying to interpret the file as UTF-8. Namely, every byte that is not valid UTF-8 gets converted to 3 bytes: EFBFBD (which is rendered as a empty square in your notepad).


You mentioned that this file is being uploaded through a Java application. It would be very easy to use an incorrect API in Java which tried to read the PDF file as a text. For instance, the following Java code produces a corrupt PDF file very similar to yours, and could be fixed by using FileInputStream and FileOutputStream:

try (FileReader fr = new FileReader("TEST.pdf")) {
    try (FileWriter fw = new FileWriter("Download result.pdf")) {
        int c;
        while ((c = fr.read()) != -1) {
            fw.write(c);
        }
    }
}


Please check that the Java app that is performing the upload is not corrupting the file.

Also, please check that you are not using the BinaryDataToText action from the BinaryData extension, which could also corrupt the PDF file in a similar way.

leonardo.fernandes wrote:

Hello Alexander.

Your corrupted PDF file is the result of trying to interpret the file as UTF-8. Namely, every byte that is not valid UTF-8 gets converted to 3 bytes: EFBFBD (which is rendered as a empty square in your notepad).


You mentioned that this file is being uploaded through a Java application. It would be very easy to use an incorrect API in Java which tried to read the PDF file as a text. For instance, the following Java code produces a corrupt PDF file very similar to yours, and could be fixed by using FileInputStream and FileOutputStream:

try (FileReader fr = new FileReader("TEST.pdf")) {
    try (FileWriter fw = new FileWriter("Download result.pdf")) {
        int c;
        while ((c = fr.read()) != -1) {
            fw.write(c);
        }
    }
}


Please check that the Java app that is performing the upload is not corrupting the file.

Also, please check that you are not using the BinaryDataToText action from the BinaryData extension, which could also corrupt the PDF file in a similar way.

Hi Leonardo,


I actually uploaded the test.pdf found somewhere in the posts above through SQL developer to rule-out any JAVA issues for that specific reason. But the fact that if you DOWNLOAD directly from Oracle SQL developer you get a valid PDF sort of convinced me the upload was not the problem.... As to the BinaryDataToText: Used it, just as every other function i could find in the binarydata api but to no avail, so ended up removing all references to the binarydata api. 


Kind regards,

Alexander


Hi Alexander. Based on that information, we can rule out the upload mechanism.

Which version of the platform are you working on? You can find the version in Service Center. On-premises or cloud? Is it a .NET or Java stack? If Java, which application server?

I want to check whether a new app has the same symptoms. If it does, then I guess that your application server or some network equipment is misbehaving. If it doesn't, then your app is probably to blame, and we just need to narrow down the root cause.

Could you create a brand new application and create a web screen that downloads the same file? You can do this easily if you store the PDF in the Resources folder. This article explains how to do it: https://success.outsystems.com/Documentation/10/Developing_an_Application/Use_Data/Use_Resources#use-resources-to-allow-downloading-a-file

Grrrrr... Just lost 15 minutes typing. Summary:


Found extra info. In brand new app every possible scene works like a charm. ALSO a scene that uses the same oracle connector extension as the original problem-app.... 

After this i uploaded the (correct and working) PDF as a site resource into the problem-app and tried to download that. Same error. Just to be sure i made two exactly the same download nodes. One in problem app, one in new app. It works in new app, not in problem app...

Hence: The trouble must lie in the problem-app somehow...

I made a perfect copy of all relevant actions and aggregates into the new app where it DID work. 

Our module setup:

XXX_UX (page) - Seperate Espace, generic start(or end, depends on how you look) point for all functionality

XXX_EU (Webblock) - Problem espace

XXX_LIB (Oracle connection extension) - Problem espace

XXX_CORE (some other functionality, not used here) - Problem espace


Where can i start to hunt down this issue?

Kind regards,

Alexander


leonardo.fernandes wrote:

Hi Alexander. Based on that information, we can rule out the upload mechanism.

Which version of the platform are you working on? You can find the version in Service Center. On-premises or cloud? Is it a .NET or Java stack? If Java, which application server?

I want to check whether a new app has the same symptoms. If it does, then I guess that your application server or some network equipment is misbehaving. If it doesn't, then your app is probably to blame, and we just need to narrow down the root cause.

Could you create a brand new application and create a web screen that downloads the same file? You can do this easily if you store the PDF in the Resources folder. This article explains how to do it: https://success.outsystems.com/Documentation/10/Developing_an_Application/Use_Data/Use_Resources#use-resources-to-allow-downloading-a-file


PS. I was answering your mail when i lost my content. The answers to your questions:

It is Version 10.0.710.0 running on a .Net stack. It runs in the amazon cloud i beleive

PS. The espace of the problem app WAS originally created in version 9. It is one of our oldest espaces (2015/2016).. All was up;graded to 10 succesfully a few months ago. 

To keep you updated:

I made the new webblock in the test application public and included that in the problem-app... Did not work. Added an extra page in the above-lying _UX module and included the webblock there... Did not work. So it seems to be wider than the problem-app. 


I then used the same THeme (custom made) in the test app... Works like a charm (not the theme)... Running out of options...

Solution

Hello Alexander. I'm glad you've made some progress on this issue.

I would start looking into extension calls. That's the only way to manipulate IO streams that could affect how the HTTP response is sent.

My main suspect now is some call to the extension HTTPRequestHandler. This extension can be used to inject things directly in the HTML, for example external scripts and stylesheets, <meta> tags, etc. If such a call is made but the response is a PDF (not an HTML), then I'm not sure what would happen.

Also check if you have an action called OnBeginWebRequest. This action is executed even before the preparation and, again, it could ruin your download if it messes with the underlying HTTP streams.

Solution

leonardo.fernandes wrote:

Hello Alexander. I'm glad you've made some progress on this issue.

I would start looking into extension calls. That's the only way to manipulate IO streams that could affect how the HTTP response is sent.

My main suspect now is some call to the extension HTTPRequestHandler. This extension can be used to inject things directly in the HTML, for example external scripts and stylesheets, <meta> tags, etc. If such a call is made but the response is a PDF (not an HTML), then I'm not sure what would happen.

Also check if you have an action called OnBeginWebRequest. This action is executed even before the preparation and, again, it could ruin your download if it messes with the underlying HTTP streams.

@leonardo: Thanks a lot.... Took me a few days of carefull elimination, buit i finally found the bugger... Something i had seen AND DISCARDED as a potential problem earlier... The AddFaviconTag from the HTTPRequestHandler that runs off our OnBeginWebRequest... 

Marked this one as solved!!!


Kind regards,

Alexander


That's some great news right there!

Try putting the AddFaviconTag into the preparation of your Layout block. That way, it will only execute when the layout is rendered - and not when you're trying to download a file.